Against the Grain V36#3, June, 2024 Full Issue

AI in Libraries


If Rumors Were Horses

Happy summer! The first day of summer is here! June in SC is hot, hot, hot, and the beaches are crowded, crowded, crowded. The Charleston area (Folly Beach, Isle of Palms, Sullivans Island, and Kiawah Island!) has been a popular vacation destination for years but it’s getting more and more congested. Here’s a fun guide to enjoying your summer in the Lowcountry from the Charleston City Paper. Where do you like to go to visit the beach? Or are you more of a mountain getaway type person? Either way, hope you get a chance to enjoy your favorite places this summer. We have a lot of Rumors so let’s get cracking!

Conferences and Events

Did you attend the SSP 2024 Annual Meeting? It was held May 29-31 in the seaport district of Boston. Leah Hinds, our Executive Director, and Don Hawkins, our Conference Blogger, were both there and report strong attendance and high energy and engagement with a strong program. Leah presented alongside Lisa Janicke Hinchliffe (University of Illinois

Empowering Researchers to Achieve More

Students and scientists face the constant challenge of keeping pace with the latest advancements and technologies. ACS In Focus, a comprehensive digital resource from the American Chemical Society, empowers readers to overcome these challenges by providing the tools they need to get up to speed quickly on emerging topics.

Find out more

Streamline Your Library Workflows

with the GOBI Library Technical Services

Spend less time on repetitive processes and more time on what truly matters: fostering a love of reading and learning in your patrons. GOBI Library Solutions offers a wide range of services to help optimize your library’s workflow.

Download your free copy of the GOBI Library Technical Services Guide today.

Download our guide to discover how GOBI can simplify your workflows with services like:

• Book ordering

• Cataloging services

• Physical processing

The benefits of outsourcing technical services:

• Increased efficiency

• Cost savings

• Improved accuracy

How to leverage GOBI to meet your library’s unique needs:

• Tailored solutions

• Seamless integration

Against the Grain (ISSN: 1043-2094), Copyright 2023 by the name Against the Grain, is published five times a year in February, April, June, September, and November by Against the Grain, LLC. Mailing Address: Annual Reviews, PO Box 10139, Palo Alto, CA 94303-0139. Subscribe online at membership-options/

Editor Emerita:

Katina Strauch (College of Charleston, Retired)


Leah Hinds (Charleston Hub)


Caroline Goldsmith (Charleston Hub)

Research Editor:

Judy Luther (Informed Strategies)

International Editor:

Rossana Morriello (Politecnico di Torino)

Contributing Editors:

Glenda Alvin (Tennessee State University)

Rick Anderson (Brigham Young University)

Sever Bordeianu (U. of New Mexico)

Todd Carpenter (NISO)

Ashley Krenelka Chase (Stetson Univ. College of Law)

Eleanor Cook (East Carolina University)

Kyle K. Courtney (Harvard University)

Cris Ferguson (Murray State)

Michelle Flinchbaugh (U. of MD Baltimore County)

Dr. Sven Fund (Fullstopp)

Tom Gilson (College of Charleston, Retired)

Michael Gruenberg (Gruenberg Consulting, LLC)

Bob Holley (Wayne State University, Retired)

Matthew Ismail (Charleston Briefings)

Donna Jacobs (MUSC, Retired)

Ramune Kubilius (Northwestern University)

Myer Kutz (Myer Kutz Associates, Inc.)

Tom Leonhardt (Retired)

Stacey Marien (American University)

Jack Montgomery (Retired)

Lesley Rice Montgomery (Tulane University)

Alayne Mundt (American University)

Bob Nardini (Retired)

Jim O’Donnell (Arizona State University)

Ann Okerson (Center for Research Libraries)

David Parker (Lived Places Publishing)

Genevieve Robinson (IGI Global)

Steve Rosato (OverDrive Academic)

Jared Seay (College of Charleston)

Corey Seeman (University of Michigan)

Bruce Strauch (The Citadel, Emeritus)

Lindsay Wertman (IGI Global)


Bowles & Carver, Old English Cuts & Illustrations. Grafton, More Silhouettes. Ehmcke, Graphic Trade Symbols By German Designers. Grafton, Ready-to-Use Old-Fashioned Illustrations. The Chap Book Style.


Annual Reviews, PO Box 10139 Palo Alto, CA 94303-0139

Production & Ad Sales: Toni Nix, Just Right Group, LLC., P.O. Box 412, Cottageville, SC 29435, phone: 843-835-8604


Advertising Information: Toni Nix, phone: 843-835-8604


Send correspondence, press releases, etc., to: Leah Hinds, Editor, Against the Grain <>

Authors’ opinions are to be regarded as their own. All rights reserved. Produced in the United States of America. Against the Grain is copyright ©2024




Register now to take advantage of the Extra Early Bird rate and secure your place!

The annual Charleston Library Conference is the can’t-miss event where industry leaders, professionals, and innovators come together to explore the latest trends, share insights, and shape the future of our field.

Registration gets you two weeks of in-person and virtual conference sessions. As always, our in-person gathering will be in gorgeous downtown historic Charleston at the Gaillard Center.


Full access to all conference sessions and the Vendor Showcase in Charleston

Complete access to our virtual conference platform

Morning and afternoon refreshment breaks

Tuesday Welcome Reception

Conference Reception at the South Carolina


Daily Friday continental breakfasts

Optional Dine Arounds

Conference materials including tote bag, notepad, pen, t-shirt, water bottle, and more

On-demand access to session recordings for three months following the event

Complimentary one-year subscription to Against the Grain journal

In-person and virtual networking opportunities

Virtual vendor landing pages and meeting opportunities

Virtual tours and chance to win prizes


In Person: November 11-15

Vendor Showcase: November 12

Online: December 9-13


Extra Early Bird: $495 (June 13 - July 12)

Early Bird: $510 (July 13 - Sept 27)

Student Rate: $95 (no deadline). See website for more.


Proposal deadline July 10. We’re seeking proposals for concurrent sessions, lively discussions, lightning rounds, posters, and more. Student submissions are welcome!

Submit your ideas here!

From Your (remembering the good times) Editor Emerita:

2024 is flying by! I am still working on writing my memoir, wracking my brain to remember all the significant events that happened from 1980 when the Conference started to now when we are part of Annual Reviews. The awesome team of Charles Watkinson and Jason Colman at The University of Michigan have me on a tight deadline! Meanwhile my gorgeous and loving Jack Russell terrier Circe had to go to doggie heaven and I am still teary! Rest in peace, darling Circe. But I am greatly looking forward to this fabulous ATG June issue. Aren’t the splendiferous Leah Hinds, Toni Nix, Caroline Goldsmith, et al, doing a marvelous job of keeping us informed?

This issue on AI in Libraries is guest edited by Vessela Ensberg and Peter Brantley, both of the UC Davis Library. They’ve both poured a lot of preparation and work into crafting an excellent coverage of the topic, with a broad range of featured authors. They paid attention to every detail — down to the font used for the article headers (don’t you love the futuristic look?) — and ensured that each of the featured articles was made openly available under a CC-BY license.

Included for your reading pleasure is an interview with powerhouse Gary Price, discussing trust and transparency, citations and provenance, and literacy for using AI, among other things. Lucy Lu Wang, from the Information School at the University of Washington, discusses generative AI for information access. Lorcan Dempsey, also at the iSchool at UW, tells us about Libraries, AI and Training Collections. Brandon Butler, recently named Executive Director at ReCreate Coalition, talks about some of the legal issues surrounding the training of generative technologies. Francesco Ramigni of the ACMI, Australia’s Museum of Screen Culture, tells us how text-to-image/video generative AI may help make library and museum collections more accessible, and includes some fascinating videos embedded in the article. And lastly, Shawn Averkamp from AVP celebrates embracing serendipitous innovation in libraries.

From our regular columnists, we have another amazing installment of the Readers Roundup reviews, edited by Corey Seeman, Director, Kresge Library Services, Ross School of Business, University of Michigan. In addition to being famous for his squirrel photography side gig, Corey keeps us all informed with reviews of reference materials ranging from “I need this book on my nightstand” to “I’ll use my money elsewhere.” Corey’s editor’s note in this installment is especially poignant given the theme of this issue — be sure to give it a read!

There’s lots more to unpack in this issue, but I’ll end it here and let you get to reading!

Love, Yr.Ed.

Letters to the Editor

Send letters to <>, or you can also send a letter to the editor from the Charleston Hub at

Hi Katina, Tom and Leah:

I hope all has been well with you since we were last in touch. After four decades at the Library of Congress, I will be retiring in a few weeks and have taken a look back with a view on technology changes and major disruptions. The result is the attached article, “Forty Years at the Big Library,” which I am submitting for your consideration.

Thanks! Joe

Joe Puccio (Retired, Library of Congress) <>

Upcoming issues and deadlines:

June 2024  04/11/24  04/25/24

September 2024  06/06/24  07/11/24

November 2024  08/15/24  09/12/24

Dear Joe,

Thank you for sharing a look back on your career with us, and congratulations on your upcoming retirement! Forty years is certainly a huge milestone and something to celebrate. We’re happy to include your special report in the June issue, just in time to announce your official retirement on June 22.

Best regards,

Leah Hinds (Executive Director, Charleston Conference, Charleston Hub) <>

Editor’s Note: Readers — be sure to check out Joe’s article on page 62, and his blog Joe’s Lessons Learned!


A Transformative Model for Open Access

• Unlimited Open Access publishing for all corresponding authors in ACM’s magazines, conference proceedings and journals

• Unlimited access for all authorized users to the full-text contents of the ACM Digital Library

• Fixed pricing for the length of the multi-year license term

• Automatic deposit at publication of accepted research articles into an institutional repository

• Default CC-BY author rights on all accepted research articles

Rumors continued from page 1

Urbana-Champaign) for another installment of the Charleston Trendspotting series. This interactive, hands-on session provides a chance for attendees to discuss trends and issues impacting the world of libraries and scholarly communication and use structured brainstorming activities to forecast possible futures (plural!) and their desirability for the industry. Each year includes a different type of small group activity to foster discussion, and this time the groups completed a “Flipping the Facts” activity. Join the upcoming Trendspotting workshops at the 2024 LIBER annual meeting (July 3-5, Limassol, Cyprus) or at the 2024 Charleston Conference (November 11-15) to experience it for yourself and contribute to the conversation!

The 2024 NASIG Conference was held June 3-7 in Spokane, WA, their 39th annual event. The NASIG Blog has excellent day-by-day coverage and photos of the event, with fun highlights such as the launch of the Vendor Expo, attending a baseball game, late night socials, jigsaw puzzles, and more. Courtney McAllister, one of our amazing Conference Directors for Charleston, is the outgoing President of NASIG. We’ve asked her to write a short summary of the event and a look back on her tenure, so stay tuned for more!

Plans for the 2024 Charleston Conference are well underway! Our theme this year is “The Sky’s the Limit!” and I love the optimism. Registration opened for the Vendor Showcase on June 11, and we sold out over 60% of capacity in the first day! Be sure to register soon to join us — don’t miss your chance! Attendee registration also opened on June 13, and folks are taking advantage of the extra early bird rate that’s available through July 12.

Our confirmed keynote and plenary speakers include:

• Fireside Chat with Katina Strauch, Founder and Executive Advisor, Charleston Conference, and author Richard Charkin, Founder of Mensch Publishing

• Dr. Tonya Matthews, President and CEO of the International African American History Museum in Charleston

• Long Arm of the Law Panel: Nancy Kopans, Vice President, General Counsel and Secretary of ITHAKA and General Counsel and Secretary of Artstor, along with Roy Kaufman, Managing Director, Business Development and Government Relations, Copyright Clearance Center, moderated by Ann Okerson, Senior Advisor, Offline Internet Consortium

Hotel rooms in Charleston are always at a premium, unfortunately, but this year in particular there’s another large event in town at the same time as the conference (a national real estate group), so please, please be sure to book your room early. We have guest room blocks at many area hotels in addition to our headquarters locations at the Francis Marion and the Courtyard by Marriott.

The Call for Papers is still open through July 10! We’re seeking proposals for concurrent sessions, lively discussions, posters, and more. There are opportunities to present either in-person or virtually. Special attention will be given to proposals that:

• are oriented toward providing practical, concrete information for practicing professionals;

• demonstrate innovative or entrepreneurial thinking;

• include a diverse representation from the different viewpoints and stakeholders in the scholarly communications process;

• generate ideas or report research that contribute to ongoing discussion about the future of the library and information industry;

• present strategies for effectively implementing new ideas and technology; and

• encourage active learning among conference attendees.

We’re also encouraging student participation and we have a special, low student registration rate. Be sure to share with any library or information school students you know! We even have a few student volunteer spots available for a free registration in exchange for helping on site at the event. Contact Christy Anderson <> for more information on the student volunteer program.

We’re also excited to bring back the Charleston Premiers for another round of “Best of” audience awards. The Premiers are five-minute previews of the new and noteworthy — an opportunity for publishers and vendors to showcase their newest and most innovative products and resources for the library community. The lightning round presentations are followed by audience Q&A and voting in the Best Design, Most Impactful, and Overall Best New Product categories. Applications are due by August 2, and the Premiers will be held at the Charleston Conference on Thursday, November 14, time and location TBA. Darrell Gunter will once again be our emcee extraordinaire — he is just masterful in the way he organizes and runs the sessions, asking astute questions of each presenter and running the audience voting procedure. Ask Darrell <d.gunter@> if you have any questions about the application process!

The two must-have academic platforms are now one company.

Now you can access Kanopy’s exclusive films and OverDrive’s millions of ebooks and audiobooks by chatting with a single representative. It’s never been easier to build your custom collection of sought-after films and essential titles that your students and faculty need.

Chat with your account rep or scan the QR code to start expanding your offerings today.

In Memoriam

Richard Brown of Bloomsbury shared this sad news on LinkedIn: “Dear Publishing Friends, I’m heartbroken to report that Patrick Alexander, who retired as director of Penn State University Press a few months ago, died recently. He was a remarkable publisher, friend, and human being. Patrick and I began our careers in religion publishing at the same time before moving to university presses, and his contributions to both communities were powerful and enduring. One of the smartest, funniest, most caring and compassionate people I’ve ever had the privilege to know. Such a huge loss for our industry, and for humanity. May he rest in peace.”

Circe, a small broken coat black and white Jack Russell Terrier, was our kind and faithful companion for many years. A memory from Leah: “I remember when I first met her back when they lived on campus at the Citadel, before moving to Sullivans Island. She was a shy, scared rescue dog; a Jack Russell terrier who barked at everyone who came to the house. We were instructed to give her a treat (usually a small piece of bacon, lucky dog!) every time we visited so she would come to trust us and even enjoy our intrusions into her home. Over time, she became more and more comfortable with us and would even sit next to me on the couch sometimes. She was an ever-present figure in the background of all of Katina’s Zoom meetings. She had a chair with a cushion right behind Katina so she was always in frame, and although she mostly slept there towards the end, she would occasionally contribute to the conversation with a friendly bark or the jingle of her collar as she scratched or adjusted positions in the chair. Due to her declining health in old age, Katina and Bruce had to make the difficult decision to have her put to sleep on June 5, 2024. She will be greatly missed by us all!”

I confess. I love baseball. We didn’t have a TV until the late ’50s, so radio was the best way to hear about baseball games. I was sad to learn of the death of Willie Mays (May 6, 1931 – June 18, 2024), one of the greatest baseball players of all time. My brothers and I used to pull for the Cincinnati Reds, but we thought Willie Mays and the San Francisco Giants were way cool. Mays’ nickname was the “Say Hey Kid,” but supposedly he never used those words in public. The “Say Hey Kid” was 93. You can see Mays’ life in photos here.

By New York World-Telegram and the Sun staff photographer, William C. Greene.

That’s it until next time! Thanks for reading and remember to send your Rumors to <>.


Bet You Missed It — Press Clippings — In the News

Carefully Selected by Your Crack Staff of News Sleuths

Try a Detective Tour

If looking to avoid tourist mobs and get to the heart of a city, or the underbelly, you need the gritty guide of a detective novel. In those novels, “cities are evolving multifaceted entities as compelling as any human character.”

In Bari, a city in the heel of Italy’s boot, the author uses Gianrico Carofiglio’s The Cold Summer as his guide. The wine bar La Staffa Enoteca leads to a chance encounter with the “best natural winemaker in Puglia.”

For Rio’s Copacabana, his guide is Luiz Garcia-Roza’s December Heat and Pursuit. This leads him to the Baalbeck food stall for a snack of kibe, the Lebanese lamb and bulgur which has become thoroughly Brazilian. He goes to the beach at sunrise to see white-clad Afro-Brazilians worshipping in the waves.

In Dublin, he uses Benjamin Black’s (John Banville) The Lock Up for whispers of the 1950s city. Bewley’s café for rich roasted coffee. For dimly lit pubs, Ryan’s of Parkgate Street and Doheny & Nesbitt.

In L.A., Steph Cha’s Follow Her Home takes you to modern Koreatown, but with a noirish past. Prince, a bar with the décor of Marlowe’s City of Angels, was used for a scene in Polanski’s “Chinatown.”

See: Tom Downey, “Crack the City Wide Open,” The Wall Street Journal, May 11-12, 2024, p.D1.

Marketing Misty

Marguerite Henry wrote 59 books, many of them bestsellers. Her most famous, the 1947 Misty of Chincoteague, was both a best-seller and a masterclass in book promotion.

The semi-true story tells of a boy and a girl adopting a wild pony from a Virginia island. The true part is there was a real horse, and Marguerite capitalized. She taught the pony tricks and hosted birthday parties for her on the lawn with hundreds of invited children.

They travelled the land of bookshops with the illustrator who drew pictures for book-buying children. Misty appeared at the 1949 ALA conference. Misty’s return to Chincoteague and foaling of a colt were press events. And of course, Marguerite produced Stormy, Misty’s Foal which was a best-seller.

All this was before children’s books were actively marketed.

Her publisher Rand McNally put out a newsletter in the 1960s for devoted fans so Marguerite could answer reader questions and hype coming books like Brighty of the Grand Canyon; Gaudenzia, Pride of the Palio; White Stallion of Lipizza.

See: Lettie Teague, “The Marketing of ‘Misty of Chincoteague,’” The Wall Street Journal, May 11-12, 2024, p.C5. Teague is a wine columnist for the WSJ and author of ‘Dear Readers and Riders: The Beloved Books, Faithful Fans and Hidden Private Life of Marguerite Henry,’ coming in May from Trafalgar Square Books.

Revolutionary Mode

During the Terror of the French Revolution, the Marquise Térézia Cabarrus was arrested, stripped of her layered petticoats, bone stays, and three-piece silken gown, and given a tattered chemise to wear. In a filthy cell, she watched each day as prisoners were taken to the guillotine.

By chance, the counter-revolution executed Robespierre and his gang, and she found herself alive, free, but penniless. Her friend Rose de Beauharnais was in the same pickle.

They hit upon fashion design and chose the prison chemise as the new Paris chic. It was oddly in line with the current zeitgeist.

During the Revolution, neo-classicism became the reaction to the prior Rococo style. Jacques-Louis David was the popular artist who painted scenes of Romans and Greeks.

Using diaphanous Bengali muslin, the two pals designed flowing gowns with high waists fitted beneath the bust. The slinky, sexually suggestive dresses created a sensation that spread from “Moscow to Lima.” Never in history had such a radical change in women’s styles occurred.

Térézia took rich and powerful lovers, had 11 children by five men, and ended up a Princess. Rose married a mesmerizing young army officer named Napoleon Buonaparte. She would become the Empress Joséphine.

See: Anne Higonnet, “Les Robes Dangereuses,” Town & Country, March, 2024, p.124. The article is an excerpt from “Liberty, Equality, Fashion: The Women Who Styled the French Revolution,” by Anne Higonnet, published by W.W. Norton & Company.

Obit of Note

Bob Edwards (1947-2024) loved radio from an early age. In the army, he worked for Armed Forces Radio and Television. He earned an M.A. in journalism from American University and went to a tiny station in Illinois. In 1974, he landed a spot at NPR where he was a co-host of All Things Considered.

Moving to Morning Edition, he spent 25 years wowing his listeners. His resonant baritone became a morning routine for fans who all felt they knew him personally.

When he was ousted in 2004, fans went berserk with complaints. Sen. Dick Durbin denounced the move from the Senate Floor.

Bob spent a further ten years at XM Satellite Radio. He felt bitter about his firing, but was always reverent towards public radio. He still listened to his old show with the new anchors every morning.

See: “The radio host who was the voice of the morning drive,” The Week, March 1, 2024, p.35.


Let’s Read Fly Fishing

(1) Luke Jennings, Blood Knots (2010) (an Englishman obsesses over fly fishing and declares it connects you with your own deep history; Jennings is the author of novellas that spawned the “Killing Eve” TV series); (2) Thomas McGuane, The Longest Silence (1999) (lyrical reveries that are “the benchmark of modern American fishing writing.”); (3) Nick Lyons, Spring Creek (1992) (month spent fishing the paradise of a private river in Montana; puts intense drama in the landing of a trout); (4) Chris Dombrowski, Body of Water (2016) (saltwater angling in the Bahamas; bonefish “the silver ghost” of tidal flats); (5) J.R. Hartley, Fly Fishing (1991) (lost youth and the charm of English chalk streams).

See: David Coggins, “Five Best,” The Wall Street Journal, April 20-21, 2024, p.C8.

Love Among the Capitalists

The annual Berkshire Hathaway shareholders’ meeting in Omaha, NE, is often called “Woodstock for Capitalists.” And wouldn’t you guess that strategizing on the big bucks brings thoughts of romance. And with that come proposals of marriage.

The Berkshire-owned Borsheim’s Jewelry is conveniently handy, and Warren Buffett himself has been known to man the counter. He has even allotted time for marriage proposals right on the floor of the meeting.

See: Karen Langley, “Annual Meetings Can Be Romantic at Berkshire Hathaway,” The Wall Street Journal, May 2, 2024, p.A1.

Son of a Fiddler and Ramblin’ Man

Dickey Betts (1943-2024) was born the son of a fiddler in West Palm Beach. He learned the ukulele by age 5, expanded to mandolin and banjo, and switched to guitar to impress girls.

He met Duane and Gregg Allman and they began a tempestuous rock career. Duane was killed in a motorcycle crash in 1971. The band repeatedly broke up and got back together. Betts was drugs and booze and raising Cain. His long hair and wild manner made him the model for the guitarist in Almost Famous.

When it’s time for leavin’ I hope you’ll understand.

Lord I was born a ramblin’ man.

See: “The guitarist who was born a ‘Ramblin’ Man,’” The Week, May 3, 2024, p.35.

More Obits of Note

Ira von Furstenberg (1940-2024) was born to a poor nobleman father and a mother whose family owned Fiat. She married a prince at 15, but was soon divorced. The epitome of the Jet Set, she travelled the globe as her marriages and passions were eagerly covered by the gossip papers.

She appeared in 25 films, helped launch Karl Lagerfeld, and ran the perfume division for Valentino. She said she spent so much time on planes that her children suspected she was “really a flight attendant.”

See: “The socialite who lived a public life of glamour and heartbreak,” The Week, March 8, 2024, p. 35.

Laurent de Brunhoff (1925-2024) was five and couldn’t sleep. His mother told him a story of an orphaned baby elephant who travels to Paris, adopts French ways, and returns to the jungle to be crowned king. Laurent and his brother demanded illustrations of their father Jean, who obliged. Soon, there were six Babar books.

Jean died young of TB in 1937, and Laurent took up brush and pen to turn out 40 more Babars over 60 years.

Laurent trained as an artist and for a while painted abstracts. When he applied himself to Babar, critics compared it to naive artists like Henri Rousseau.

Of course, starting in the 1960s, the modern-day pecksniffs had to find Babar an allegory of French colonialism and condemn the unearned wealth of the Old Lady, Babar’s elderly Parisian benefactor. Laurent replied that he was writing stories, not social theory.

Babar came out in 18 languages, had a 1989 movie and multiple TV adaptations. Laurent declared Babar his friend. He didn’t think about the children who would read the books. “I wrote for myself.”

See: “The French author who popularized Babar,” The Week, April 5, 2024, p.35.

Duane Eddy (1938-2024), the Grammy-winning guitarist, is dead at 86. A pioneer of rock, he was known for his instrumental “twang.” He had 16 top-40 singles and sold 100 million records. He began playing guitar at the age of 5 and had his breakout single “Rebel Rouser” in 1958. The following year he played Henry Mancini’s theme song for the TV series Peter Gunn. He then scored music for the movies “Because They’re Young,” “Pepe,” and “Gidget Goes Hawaiian.”

He inspired George Harrison and Southern California surf bands of the ’60s.

See: Duane-Eddy-guitarist-Peter-Gunn-dead.html


... but this is wondrous strange (Hamlet, Act 1, Scene 5, line 171)

By Vessela Ensberg (University of California Davis Library) and Peter Brantley (Director, Online Strategy, University of California Davis Library) <>

Excitement and apprehension have surrounded the explosion of accessible generative AI over the last few years. As we finished putting together this issue, the field of generative AI remains very dynamic. We have tried to capture how the latest developments affect libraries, and the vision, challenges, and opportunities of how libraries can affect that development.

Before we dive into what is probably the most discussed technology today, it may be helpful to reflect on evolving perceptions around similar groundbreaking innovations as a guide to managing generative AI. When the library community encountered Google search and Wikipedia, some of the initial reactions may have included skepticism and caution. Today, Google Scholar is considered a good place to initiate a search, and libraries have set up Wikipedia editathons and hired Wikidata librarians. The initial assessments of the tool limitations were right at the time, but perhaps the potential was not sufficiently appreciated. With any new technology, it is helpful to assess what functionality is needed to move it from limiting to useful, and make a strong case for prioritizing that.

There are signs that we may be descending into the “Trough of Disillusionment” of the hype cycle1 for AI; professionals seem to have moved from being surprised by hallucinations to checking for them, and from praise to growing criticism. In this issue we want to acknowledge generative AI’s limitations and at the same time recognize its potential.

Some of the disillusionment may be due to misunderstanding of what can be expected of AI models. Many of our writers touch on the general workings of generative AI, but it’s worth summarizing the key concepts here. Generative AI is a complex neural network algorithm that predicts what element is most likely to belong in a sequence.2 For language models, the elements are words; for diffusion models, they are pixels. The statistics are derived from training datasets (collections of texts and/or images), and the model outputs result in novel, often fascinating, texts and/or images. The bigger the model, the more data and computational power are required to train it, and the higher the environmental and financial costs. The datasets and their inherent biases, discussed in more detail in Lucy Lu Wang’s article, are a consistent source of concern. Finally, the models are non-deterministic, which means that answers to the same question or prompt will vary to some degree; there are parameters that users can modify to make them more or less so. Because of the probabilistic nature of the algorithms, they are prone to making statements that may not reflect reality and facts.3 The complexity of the models makes it difficult, if not impossible, to understand how they derived a particular output. It is as if we arrived at the answer “42” in The Hitchhiker’s Guide to the Galaxy; the question remains to be computed. Reducing hallucinations and understanding model outputs is in itself a research field. Generative AI is used in a number of disciplines from finance to medicine, and recently researchers in clinical imaging have made some progress in understanding breast cancer risk forecasts.4 Successful application of these methods in other models will be a significant advancement in understanding their outputs. Meanwhile, adding retrieval-augmented generation to the model is an increasingly popular approach to try to establish the provenance of an output. The approach provides an additional set of documents to the algorithm to enhance inferencing on the source material.5 However, the original training and the resulting parameters still define the model, so while hallucinations are reduced, they are not eliminated.6 This result is disappointing if the models are expected to perform perfectly, but, like Google search and Wikipedia, generative AI is a tool, and the way to deal with the failings of a tool is to understand and attempt to fix them, rather than reject the tool altogether.
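As a toy illustration of the non-determinism described above (hypothetical code, not drawn from this issue or from any actual model), the sampling step of a language model can be sketched as a temperature-scaled draw over the model’s scores for each candidate next element. The function name and the example scores are invented for this sketch; the point is only that a low temperature makes the choice nearly deterministic, while a higher one lets less likely elements through.

```python
import math
import random

def sample_next(scores, temperature=1.0, rng=None):
    """Pick the index of the next element from raw model scores.

    Temperature near 0 -> nearly deterministic (always the top score);
    higher temperature -> more varied output from the same prompt.
    """
    rng = rng or random.Random()
    if temperature < 1e-6:  # treat 0 as pure argmax
        return max(range(len(scores)), key=lambda i: scores[i])
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    # Draw proportionally to the (unnormalized) weights.
    r = rng.random() * sum(weights)
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if r < cumulative:
            return i
    return len(scores) - 1

# Hypothetical scores for three candidate words: at temperature 0 the
# same prompt always yields the same word; at temperature 1 it varies.
logits = [2.0, 5.0, 1.0]
print(sample_next(logits, temperature=0.0))  # always index 1
```

This is also why “asking the same question twice” is a cheap sanity check on generative output: two runs at a nonzero temperature can disagree, and the disagreement itself is informative.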

To use a different metaphor: cars are dangerous. If you have a teenager, or if your neighbor does, you probably had an uneasy feeling when they got behind the wheel. But you likely still own a car, and feel better that since 1886 cars have been equipped with seatbelts, airbags, and engine control units. The authors in this issue flesh out the safety equipment for generative AI as they reflect on how libraries should work with the tool.

Several of our authors remark on the value of curation and evaluation of content. Gary Price explores the challenges and opportunities for our community to develop “driver’s ed” for AI. His piece focuses on what it takes to achieve trustworthiness of output. His advice to information professionals is to focus on life-long learning and to embrace curation, advocacy, and verification. Lucy Lu Wang makes a complementary case in her piece, “Generative AI for Scholarly Information Access,” emphasizing the connections libraries foster with their communities as crucial for equitable curation. Her article addresses concerns about biases introduced through training materials.

Similarly, Lorcan Dempsey urges libraries to actively engage with shaping the norms around AI use. He also invokes curation of collections as a major leverage point in directing and correcting the course of development of generative AI. After all, if John Wiley & Sons can receive $23MM for the use of its collection for AI model training,7 what possibilities might arise if libraries (and publishers) coordinated their efforts? What if libraries defined clear criteria and expectations for models they would be willing to endorse? What is our responsibility in reducing the perpetuation of bias? Treating AI as a tool means accepting a challenge and a responsibility: realizing AI’s potential is on us.

In his article “Some Next Generation Legal Issues for Training Generative Technologies,” Brandon Butler boldly goes into mapping the legal landscape around training AI on copyrighted works. His discussion of the pending copyright infringement lawsuits against generative technology companies brings into focus another concept of fundamental importance to libraries: fair use. Brandon outlines the future impact that the outcomes of such lawsuits can have on organizations beyond the original litigants.


Did you know…

97% of the 100 Best MBA Business Schools, according to the Financial Times’ 2023 Ranking, subscribe to INFORMS journals?

88% of the Top 100 Global Engineering Schools, according to U.S. News & World Report 2023-2024 Ranking, have INFORMS journals in their collections?

Expand your holdings with INFORMS journals. Our titles cover important operations research and analytics discoveries and content that meets the needs and interests of researchers, practitioners, students, and business leaders.

Preview all 17 INFORMS journals and learn how to save over $4,000 on the entire INFORMS PubsOnline Suite.

Francesco Ramigni gives us another glimpse of what the future may hold through his vision of contextualized search. In his piece “Visualising Collection Records With The Use Of Generative AI,” Francesco describes the integration of content from different types of media to transform the catalog from a list of records to a collection of trailers. He walks us through some of his own experimentation and outlines the long road to achieving his vision.

Shawn Averkamp’s motivational recounting of her experience working with libraries to incorporate machine learning into library operations encourages “Embracing Serendipitous Innovation in Libraries.” Her recipe of approaching work with AI tools as experimentation, with curiosity, an open mind, and, most importantly, human-centric design, may be the key to unlocking that potential.

When reading about the challenges in getting the right output from AI tools, it may be helpful to take Gary’s advice on looking at how other fields use AI. The machine-learning algorithm AlphaFold2, which predicted protein structure with close to 90% accuracy,8 enabled the field of structural biology to leap into the previously impossible. Levinthal’s paradox of protein structure held that arriving at a structure through brute-force sampling would take longer than the age of the universe.9 Reflecting on this feat is a motivation to keep exploring new applications for AI. If we choose carefully the seemingly impossible transformations that AI tools enable, we may very well be looking at a new golden age for collections, resurrecting people and their stories.10 As we readied this editorial, the Association of Research Libraries and the Coalition for Networked Information published a report on AI scenarios, one of which explores this resurrection in an aptly named application, LAZARUS.11

There are many other topics and visions that belong in this issue. We wish we could have had a discussion on the ability of AI tools to synthesize ideas at scale or to see connections in disparate contexts. The effect those functionalities will have on the perception of creativity may be similar to the way photography both supported the development of realism as an artistic movement12 and catalyzed the emergence of impressionism.13 One cannot but wonder what the equivalents of Monet’s “Impression, Sunrise” and Pissarro’s “Boulevard Montmartre” will look like in the wake of generative AI. We plan to reread this issue, with some trepidation, in a year to see what has come to pass and what else we have missed.

Meanwhile, we invite you to immerse yourself in the insights our authors have to share today. They approach generative AI tools with eyes open to their deficits, but also with many ideas on the roles for librarians and libraries that they open up. They identify the need for visionaries and trustworthy curators who represent community interests, whose expertise spans subject knowledge and technology, who build training collections, who explain benefits and pitfalls of using artificial intelligence for a particular task, and who are sensitive to serendipity. We hope you find their insights useful, and even inspiring.


1. gartner-hype-cycle

4. https://

9. proteins_levinthal_1969.pdf, https://misciwriters.com/2017/03/14/computing-levinthals-paradox-proteinfolding-part-2/

10. Some interesting work came close to exploring that at the Fantastic Futures 2023 conference.

13. impressionism-the-influence-of-photography/


An Interview with Gary Price

and Peter Brantley (Director, Online Strategy, University of California Davis Library)

Claude 3 Opus edited transcript with revisions by Vessela Ensberg, Peter Brantley, and Gary Price.

Vessela Ensberg: Gary, can you talk about how a user can trust the model they’re using and what creates that trust or causes a lack of trust in it?

Gary Price: When it comes to trust in AI, a key point is that words matter. Our community has a huge role in defining things properly. Should we focus this discussion primarily on GPT models or AI as a whole? People may be confusing the definitions.

VE: Do you think there’s a difference in terms of what is trusted, whether it’s AI or GPT?

GP: It depends on how you’re interacting with it. With a self-driving car, for example, you have to trust the car designers, software developers, and their third parties to auto-drive the car and get you to the right place. You have little control as a user.

VE: For this interview, let’s define AI as generative AI that has been released to the general public, not specialized applications confined to particular companies.

GP: Trust comes from experience, use, reputation, and many other variables. If I’m writing a paper using GPT for research, do I trust OpenAI, Anthropic, or Perplexity to get it right without extensive review? At this point, I don’t.

VE: What would make you trust it more?

GP: Time. Research. People who use and research AI tools over time see improvement and growth in the product and in the service. The example I use is very simple: Would I trust ChatGPT to research facts and produce a slide deck with charts and graphs of the results for my boss to present?

I have run into any number of situations where a simple, well documented fact is presented incorrectly. Then, if I press the AI with a possible correction, it will respond: “Your answer X is true.” The precise words in the prompt matter!

VE: Do you have certain benchmarks or points you run through to determine how much you trust the output?

GP: One of the biggest things, which is not readily available in many situations, is the provenance and transparency of the training data. We don’t know where any of this is coming from. As a researcher, I can’t tell my boss where the data is sourced from. That can lead to a slippery slope in terms of output quality.

It is important to remember that I am thinking this way because of my education and training and experience, but a lot of people don’t. People take whatever they get, a concept called “satisficing” that gained currency around 2005-2010. In many cases, good enough is often good enough, but even then, if the sources could be referenced in a bibliography, or a webliography, a reviewer could see whether it’s from reputable places.

I was just reading an article about the growing amount of AI content out there: AI generating article after article after article. Within seconds, you can create a lot of junk. Now, with the growing number of AI-generated content farms, a GPT model could be sucking in a lot of misinformation or disinformation. If I’m researching with GPT and have no idea where facts are coming from, it makes replication, which is already often poor, even more challenging.

Over time, if you understand how a model is put together and what’s going into it, the trust level may increase.

Peter Brantley: So transparency into model inputs would foster some greater sense of trust or understanding, but I also assume you wouldn’t argue that transparency is sufficient for trust? Simply knowing what training data are used doesn’t necessarily invoke adequate trust levels?

GP: Absolutely, it’s just one of many variables that can help gain trust. It is not any form of guarantee of trust. Also, most major GPT providers today are large companies. We need to consider the business implications of putting trust in them. We know they have distinct profit motives and values. That’s another variable in the trust equation. That also leads to questions of trust regarding how they are using this data.

VE: You touched on citations and provenance for how responses are derived. What about citing the model used? Does a citation of a large language model response allow for reproducibility — and does it need to?

GP: Even if you know all the data going into the model, I don’t think that guarantees a reproducible and reliable result over time for users. There are too many variables — not just the underlying data but how prompts are worded. There are so many things that can affect reproducibility.

PB: Models are constantly being modified by changes in system prompts as well as in user prompts, and in various tweaks, potentially in parameterization. Technically, guaranteeing reproducibility may not be feasible currently. Given the need for reproducibility in research, how do we understand working productively with these models? What does that look like?

GP: The way for librarians and publishers to encourage trust and reproducibility is to have a hand in developing these tools and models. I don’t see any other way to have a say in how these things are developed. The growth of RAG (Retrieval-Augmented Generation) and RAFT (Retrieval-Augmented Fine-Tuning) will also be ripe for librarian involvement. There’s also an opportunity now for developing public models or ones for specific needs and user groups. This is the only way we can have the control we want to have. The issues that our community had with Google are minuscule compared to the issues we have here.


VE: You are touching on a lot of topics here. Let’s break them down and talk about Google. What did we learn, and how is it different now?

GP: One of the things I learned from Google was the value of being first in the technology cycle. Now when people are thinking about GPT, they are not thinking about Google any more; they are thinking about ChatGPT, OpenAI. Google is a major player with Gemini and the funding of Anthropic, but for the average user “ChatGPT” has become a verb.

With Google, you type in your keywords. At least with Google you are given a source. The quality of the source, the currency of the source — that is another matter. Of course this is, at least at the moment, beginning to change. AI summaries are coming to Google results pages, and you’ll need an extra click to get to “traditional” Google results (or so we’ve been told). With LLMs you are getting a nicely organized result that is ready to go; you can paste it in. People don’t have a lot of time; they want what they want when they want it. ChatGPT outputs, for most intents and purposes, are ready to go. As LLMs develop, we are seeing more web citations being supplied with these new services, but given all that we know about the manipulation of web results, this is a slippery slope.

VE: Is it ready to go, or is it tricking you into thinking it is ready to go because it sounds so smooth?

GP: No, you’re right, and I should be clear. It’s tricking you into thinking that it’s ready to go. I don’t know if most people realize it is tricking them by its very mechanism. Do people know enough to verify any of those things? This makes me think about how we are educating people about information and digital media literacy. This education is not where it should be. You don’t know what you don’t know. And to verify, you sometimes need to know what you don’t know.

VE: So how do you become a literate user?

GP: The first thing is to know about all of its problems, and to be aware of alternatives, whether it be another GPT model, or whether it be Google, or whether it be Semantic Scholar, or whether it be one of the hundreds of databases that UC Davis, for example, licenses.

You can tell me if I am right or wrong, but most people are starting with Google Scholar or perhaps even just with Google. Yet if they knew one or two of the databases that you’re spending a lot of money for, they could get a better, more precise answer in a shorter amount of time. Tom Mann, who worked at the Library of Congress, talked about the principle of least effort in his book, A Guide to Library Research Methods.

GPT takes that to a whole new level because after just typing in a few words, it might be tricking you into believing that you’re done. Yet, people can’t use what they don’t know about. The first thing I would do as an information professional is be aware of the alternatives. If I need to hammer in a nail, I’m not going to use a screwdriver. I think most database resources are probably unknown to a lot of people who would benefit from them, but they are hearing about Google. They are hearing about ChatGPT. So that’s what they use. Even if you have access to one extended model via a license through your company or institution, I strongly recommend using other LLMs, at least from time to time, to compare responses. Poe, a freemium service, makes this easy and fast.

VE: How much does the ease of use factor in here? How much is it the fault of the user defaulting to the easier tool, and how much is it the fault of information resource companies not making their products user friendly?

GP: Google made it so easy. It’s that one box; you type in something, and you always get something. I think that’s the exact same thing when you’re interfacing with ChatGPT. There’s the box you type something in, and you don’t even have to click on any links to verify or to look at it. You get something back in a nice paragraph. I would agree with you. Ease of use is a big part of this.

Google led to improved ease of use of traditional database providers. But to use them at an advanced level, there is a learning curve. Maybe putting GPT over it will make it easier. I don’t know of any way of getting around the learning curve, if there’s a need for one. When you get to the core of what can motivate many people to use this technology, it’s saving time and effort. Asking people to take time to learn (often on their own) how to best use the technology and to have a very basic level of understanding can be a challenge.

VE: Based on what you just said, what are the most important roles for information professionals today and in the next few years?

GP: Number one: educator. Another huge role is to try to get a seat at the table and share your views or the community’s views. That’s easier said than done. Similarly, a huge opportunity that will only get better with time is being a curator. Everybody in our community believes that it’s garbage in, garbage out; good stuff in, good stuff out. One role for the curator is to help build these large language models with good data, with quality data from quality sources. Another is knowing the right sources: helping select the databases, the PDFs, and not only text materials but also audio and video, that the large language model will query via RAG (Retrieval-Augmented Generation) at the time of need. A third role is building curated GPTs for instruction for a specific class, for a specific faculty member, for a specific Ph.D. student. Not only building them once, but helping to keep these GPT models as current as possible, reviewing them, interacting with the patrons on a regular basis, and finding out if they are getting what they need. The curator has a huge opportunity here.

VE: Building on that, what are the concepts and tools and ideas that we need to develop for fact-checking? Do we need a trusted clearinghouse? What are your thoughts on that?

GP: Our community would benefit from a nonprofit entity that would take what I’m doing and expand it. There’s just too much information. There’s too much data. You need to be looking at how these tools are being used in other areas beyond libraries and publishing, in other communities. Don’t always read what you want to read; read beyond it, and get a broader view, stimulate more ideas.

VE: I’m hearing two things here. One is learning from other fields, adopting practices. The second one I think you’re describing is the libraries being keepers of a small curated body of knowledge for critical applications.

GP: Picture an Information Clearinghouse in which a person tests how different GPT models provide different results and finds which one might work best for you. Information technology can play a big role in giving people the information they need, both directly related to their field and from a tangential field that could create new ideas and new thoughts, in a format and level that they can absorb.

VE: What is a good model to use to teach AI literacy? There are standards for information literacy, and data literacy. What is different about artificial intelligence literacy?


GP: The volume of material; the rapid change. The privacy aspect needs to be part of AI literacy.

You need the same thing that makes you a safe driver of a car. You don’t need a PhD-level understanding, you need to teach driver’s ed: a basic understanding of how a car works, identifying road signs, operating the vehicle safely around other people and other cars. There needs to be an “AI driver’s ed”: how to operate AI, what to expect from it. Constant updating. Awareness of where it works well and of its limitations. Awareness of business implications, given that most of the development is done by three large companies. Customize your examples to the audience.

PB: Generative AI systems make associations between bits of information and, more broadly, between fields of knowledge that humans might not otherwise have associated. The machine, deriving insights into natural systems, acquires knowledge that would otherwise have been opaque to the traditional trajectory of human research. Examples range from DeepMind’s extrapolation of protein folding to the surprise of machine-driven moves in a Go game.

GP: If an AI model makes a scientific discovery by identifying new connections, doesn’t that still need to be experimentally verified and replicated to be considered valid knowledge? The AI can generate hypotheses and insights, but other methods are needed to confirm accuracy and legitimacy.

I think there’s an opportunity for librarians to do fact checking on model responses. Given the volume, it would be challenging. It would be interesting to take different questions from different people over time, and have those results rated for quality. Still, if you’re not getting a source, it becomes very, very difficult to do verification. You would need to find the fact that needs verification on a website or in a journal.

PB: If people are trusting social media or generative AI because the curve of truth or reproducibility, or validity in any set of responses, is opaque to them, then they don’t know when and where to exercise the time and judgment to validate. And that’s a hard thing to ask of users, generally.

GP: This is why it needs to be taught or discussed.

VE: To summarize the key themes here: don’t trust, verify. And maybe have AI itself write this article from our conversation?

GP: Yes, and I’d be curious to see the results of having the transcript fed through different GPT models to generate an article.


Generative AI for Scholarly Information Access

by Lucy Lu Wang

Many in the information seeking community are excited about the promise of large language models and Generative AI to improve scholarly information access. These models can quickly transform the content of scholarly works in ways that make them more approachable, digestible, and suitably written for audiences for whom the works may not have been originally intended. However, current technical implementations of Generative AI can limit its utility in these settings. Issues of hallucination (models generating false or misleading information) and bias propagation are still common, making it difficult to recommend these technologies for critical tasks. Dominant paradigms for addressing these issues and achieving alignment between AI and human values can also reduce the diversity of output, which can lead to information censorship for stigmatized topics, going against the goal of broad access to high-quality information. In this essay, I discuss the promises of AI for improving access to scholarly content, how current practices in Generative AI training may lead to undesirable and possibly unintended consequences, and how libraries and other community organizations could place themselves at the forefront of solutions for improving the individual and community relevance of these technologies.

The Promises of AI for Scholarly Content Creation and Understanding

Most scholarly content is written for other scholars. These texts make heavy use of technical jargon and assume a high level of background knowledge and domain expertise, raising barriers to reading and understanding. Engaging with these works can be difficult even from within their scholarly subcommunities, not to mention trying to do so from outside the academic sphere. Yet, the goal of many scholarly communities is to produce results and insights that can help improve individual and societal well-being. For example, while clinical trial reports for new medications are published in clinical journals, the people that stand to gain the most from promising results are patients and their caregivers.

Generative AI models, such as large language models (LLMs) and text-to-vision models, have the ability to quickly transform the content of scholarly works, changing the language, tone, form, and presentation of these works to make them more approachable, understandable, and engaging for different audiences. In recent work, we have shown that simplifying text using language models can make difficult information more digestible (August et al., 2022); changing the form and presentation of material can reduce barriers to engagement (Shin et al., 2024); and synthesizing information across many papers can simplify the process of reviewing the literature (Giorgi et al., 2024). LLMs can also help with content creation, especially in cases where scholars lack the time and knowledge to easily perform this work without training. For example, they have been used successfully to help review papers (Zyska et al., 2023; D’Arcy et al., 2024) and describe complex figures to blind and low vision readers (Singh et al., 2024). When assisting users in writing, these models can reduce the time needed to produce high-quality content, while emphasizing the role of human authors in verifying that generated text accurately reflects their original intent.

Realizing the potential of Generative AI technologies, however, requires deeper understanding of individual and community-specific factors that influence scholarly information access, and the interplay between these factors and the design limitations of Generative AI systems. A specific tension: we know that Generative AI can produce false or misleading output, which is a deal-breaker in scientific settings. It can also produce toxic or biased output, which may perpetuate social biases and warns against its use for critical decision-making tasks. Current mitigation strategies for these negative behaviors, meanwhile, can also lead to homogenization of output, which may not support diverse community needs. This manifests more when searching for information on stigmatized topics such as mental illness, reproductive health, or disability, where access to high-quality information is already difficult. So while these systems have the potential to improve information access, we must balance the various sides of this tension (accuracy, bias mitigation, and individual relevance) to achieve the promise of these technologies for end users.

Technical Limitations of Generative AI

Most popular LLMs are accessed through a chat interface, with users prompting the model in human language to provide a response or perform some task. Compared to classic search engines, this setup introduces new points of friction. First, LLMs compress information across many sources without attributing output to any specific source (or at least, without any guarantee of the correct source). At the same time, the tone of LLM-generated content tends to be official-sounding and confident. Together, these features reduce our ability to judge credibility based on previously reliable heuristics such as language quality or the trustworthiness of specific information sources. In a scholarly communication setting, false information and the inability to judge the veracity of information are unacceptable outcomes, since misleading content can pollute the scholarly record and reduce the value and trustworthiness of the entire scholarly enterprise.

Another source of problems is unrepresentative training data. High-quality data is essential to achieving good model performance, yet most LLMs are trained on mixtures of text and images scraped from the web, which is not representative of human society. Much of the toxicity and bias in AI output can be attributed to such issues within the training data (Zhao et al., 2017; Dodge et al., 2021; Buschek and Thorp, [n.d.]). Previous work has shown how marginalized groups tend to receive biased treatment from these models, in terms of less equitable representation (Ghosh and Caliskan, 2023; Zack et al., 2023), higher rates of flagging in content moderation (Sap et al., 2019; Davidson et al., 2019), and more. While companies scramble to acquire better training data,1 reinforcement learning with human feedback (RLHF) has become the dominant paradigm for mitigating such issues from the modeling side (Glaese et al., 2022; Bai et al., 2022; Wu et al., 2023).

RLHF methods use human-labeled preferences between different answer choices to further train AI models, with the goal of achieving closer alignment with human values and expectations. While effective at reducing some forms of toxic and biased output, these methods have limitations. RLHF as currently implemented assumes the existence of a homogeneous, shared set of human values that models can learn and optimize for, while we know that communities are diverse in their beliefs and needs (Kirk et al., 2023). Models trained with RLHF have been observed to produce less diverse responses (Padmakumar and He, 2023). This homogenization of output (Anderson et al., 2024) can cause a tendency to prioritize “safe” answers about “normalized” topics, which reduces a model’s ability to provide accurate and actionable answers for long-tail or stigmatized topics (Oliva et al., 2020; Gadiraju et al., 2023). The potential for harm from additional information censorship is high, since people already face difficulties accessing high-quality information about these topics. In a scholarly communication setting, this may extend to a likelihood for AI to perpetuate the status quo, recommending work and findings from only the most “canonical” scholars and institutions.
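The preference step at the core of RLHF can be illustrated with the pairwise (Bradley-Terry) loss commonly used to train reward models. This is a from-scratch sketch of the loss function alone, not the implementation of any system cited above:

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss for a reward model trained on human preferences.

    The probability that the human-preferred answer "wins" is
    modeled as sigmoid(reward_chosen - reward_rejected); the loss
    is the negative log of that probability. Minimizing it pushes
    the reward of labeled-preferred answers above the alternatives.
    """
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Because a single scalar reward is fit to preference labels pooled from many annotators, disagreement between communities is averaged away, which is one mechanism behind the homogenization concern described above.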

Improving the Community Relevance of Generative AI

Community-oriented Generative AI must be aware of the needs and challenges of individuals and adapt to better serve those needs. While current LLMs are hampered by the limitations I discussed previously, there are several promising developments, such as the increasing availability of open-source LLMs, more awareness around the need for representative training data, and techniques to adapt LLMs to different domains through retrieval augmentation or additional models (perhaps maintained and governed at the local level). I focus here on the retrieval-augmentation paradigm and how this may be a viable and appropriate way to adapt language models to the needs of individual communities. Specifically, it is a paradigm that works well with the existing nature of libraries as community repositories and curators of knowledge.

The Retrieval-augmentation Paradigm

To offset issues of hallucination, researchers have proposed retrieval augmentation as a way to encourage more faithful, attributable, and accurate model output (Lewis et al., 2020). Instead of relying on a model’s parametric knowledge (what it learned during training), the retrieval-augmented generation (RAG) paradigm acknowledges that training data is limited, and proposes to augment the model at the time of use with additional information. Models can then combine retrieved information with their parametric knowledge to produce more useful outputs. This can address cases where the information needed to provide an answer is missing from the model’s training data or simply out of date. Additionally, because we know the origin of the retrieved information, we can attribute model-generated content directly to primary and/or secondary sources, which makes it easier to assess the credibility of the model’s statements. As an example of how this might work: take the FDA’s recent March approval of a new drug for treating resistant hypertension, aprocitentan.2 A model trained on data before this time may be able to infer that approval is likely, based on prior publications documenting the drug’s effectiveness in clinical trials (Schlaich et al., 2022), but it cannot be certain. In cases like these, retrieval augmentation would bridge the gap in knowledge and allow us to identify new publications — e.g., news articles, reports, press releases, etc. — documenting the approval.
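As a rough illustration of the retrieval step, the sketch below scores a query against a tiny corpus by word overlap (a deliberately naive stand-in for the dense-vector search used in practice; the document ids are invented for the example). The point is that retrieved passages keep their source labels, so generated statements can be attributed:

```python
def retrieve_with_sources(query, corpus, top_k=2):
    """Return (source_id, passage) pairs best matching the query.

    `corpus` is a list of (source_id, passage) tuples. Scoring by
    shared lowercase words is simplistic; production RAG systems
    rank by embedding similarity, but either way the source id
    travels with the passage, enabling attribution.
    """
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(text.lower().split())), source, text)
        for source, text in corpus
    ]
    scored.sort(reverse=True)
    return [(src, text) for score, src, text in scored[:top_k] if score > 0]
```

A generation step would then be asked to answer using only these passages and to cite the returned source ids, which is what makes the output checkable.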

Using Retrieval Augmentation to Achieve Community Relevance

Libraries are community organizations with deep knowledge of their patrons and their needs, and have the power to acquire print, digital, and other resources to serve those needs. We should be leveraging the role of libraries as curators of information resources as a bridge between their communities and Generative AI technologies. AI and LLMs should be able to access content on behalf of the members of the communities they serve, through retrieval augmentation, in order to produce more relevant and useful outputs.

How to manage and implement this type of access is an open question. To make physical or subscription content usable for retrieval augmentation, these materials must be digitized and parsed into machine-readable units of text. If shouldered by individual libraries or community archives, the time and resource costs would be untenable, and much of the work would be redundant. Building a centralized digitization repository is more straightforward for open access publications, though issues of funding and maintenance would still need to be addressed. But even with all the efforts made in that space in recent years, a majority of scholarly materials still require fee-based or subscription access. Publishers managing closed access materials would need to push such content to different institutions based on the terms of their subscription agreements. After access is obtained, local instantiations of LLMs could be connected to what is institutionally accessible as an additional, unique-to-that-community source of data for retrieval augmentation. Current standards for data exchange3 would suffice for transferring scholarly works between centralized and local databases, but additional clauses would be needed to communicate appropriate use of the content by language models (i.e., consumption, attribution, distribution, adaptation), the scope and duration of use, and the data transformations required for efficient and reliable retrieval.
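To make the idea of "machine-readable units" carrying usage terms concrete, here is a minimal sketch of chunking a licensed document into retrieval units tagged with scope-of-use metadata. The field names and the permitted-use vocabulary are hypothetical assumptions for illustration, not an existing standard.

```python
# Sketch: parse a licensed document into retrieval units, each carrying
# scope-of-use metadata. Field names and the license vocabulary are
# illustrative assumptions, not an existing data-exchange standard.
from dataclasses import dataclass

@dataclass
class RetrievalUnit:
    text: str
    source_id: str         # identifier of the parent work (e.g., a DOI)
    permitted_uses: tuple  # e.g., ("consumption", "attribution")
    expires: str           # end of the subscription term (ISO date)

def chunk_document(text, source_id, permitted_uses, expires, max_words=50):
    """Split a document into fixed-size word windows, attaching the same
    usage terms to every unit so retrieval can exclude content a model
    is not licensed to consume."""
    words = text.split()
    return [
        RetrievalUnit(
            text=" ".join(words[i:i + max_words]),
            source_id=source_id,
            permitted_uses=permitted_uses,
            expires=expires,
        )
        for i in range(0, len(words), max_words)
    ]

units = chunk_document("word " * 120, "doi:10.1000/example",
                       ("consumption", "attribution"), "2025-12-31")
```

Carrying the terms on every unit, rather than only on the parent work, lets a local retrieval index filter content at query time when a subscription lapses or a use falls outside the agreed scope.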

Organizations already positioned as community repositories of knowledge should lead the charge. These are suitable places to deploy Generative AI systems adapted to the local community, and to educate those communities about the appropriate uses and limitations of AI. While the major contenders in Generative AI development race to make their models more powerful and performant, libraries and community organizations should focus on creating the adaptations and connections that make these systems usable for their patrons. An organization taking the approach of adapting a general-purpose AI model would need to create the data infrastructure to make information resources available to that model, while maintaining the ability to define the scope of use for each resource.

An alternate approach might be to develop models of one’s own, trained for specific tasks and goals using the available data. While this seems like a weighty ask, the difficulty of training and maintaining such a model is dropping as these technologies mature, and this option will become much more feasible over time. As for why it is attractive: researchers have demonstrated in many cases that smaller models trained on curated data can match or exceed the performance of larger general-purpose models adapted for the same tasks (Li et al., 2023; Jiang et al., 2023). Taking such an approach allows an organization to maintain control over the goals of these technologies, define the scope of abilities based on community needs, and address shortcomings quickly, while freeing the organization from the whims and decisions of distant and unresponsive tech giants.


Barrett R. Anderson, Jash Hemant Shah, and Max Kreminski. 2024. Homogenization Effects of Large Language Models on Human Creative Ideation. ArXiv abs/2402.01536 (2024).

Tal August, Lucy Lu Wang, Jonathan Bragg, Marti A. Hearst, Andrew Head, and Kyle Lo. 2022. Paper Plain: Making Medical Research Papers Approachable to Healthcare Consumers with Natural Language Processing. ACM Transactions on Computer-Human Interaction 30 (2022), 1–38.

Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, Nova DasSarma, et al. 2022. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. ArXiv abs/2204.05862 (2022).

Christo Buschek and Jer Thorp. [n.d.]. Models All the Way Down. ([n. d.]).

Mike D’Arcy, Tom Hope, Larry Birnbaum, and Doug Downey. 2024. MARG: Multi-Agent Review Generation for Scientific Papers. ArXiv abs/2401.04259 (2024).

Thomas Davidson, Debasmita Bhattacharya, and Ingmar Weber. 2019. Racial Bias in Hate Speech and Abusive Language Detection Datasets. ArXiv abs/1905.12516 (2019).

Jesse Dodge, Ana Marasović, Gabriel Ilharco, Dirk Groeneveld, Margaret Mitchell, and Matt Gardner. 2021. Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus. In Conference on Empirical Methods in Natural Language Processing.

Vinitha Gadiraju, Shaun K. Kane, Sunipa Dev, Alex S Taylor, Ding Wang, Emily Denton, and Robin N. Brewer. 2023. ”I wouldn’t say offensive but…”: Disability-Centered Perspectives on Large Language Models. Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (2023).

Sourojit Ghosh and Aylin Caliskan. 2023. ChatGPT Perpetuates Gender Bias in Machine Translation and Ignores Non-Gendered Pronouns: Findings across Bengali and Five other Low-Resource Languages. Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (2023).

John Giorgi, Amanpreet Singh, Doug Downey, Sergey Feldman, and Lucy Lu Wang. 2024. TOPICAL: Topic Pages Automagically. NAACL System Demonstrations (2024).

Amelia Glaese, Nathan McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, Timo Ewalds, et al. 2022. Improving alignment of dialogue agents via targeted human judgements. ArXiv abs/2209.14375 (2022).

Albert Qiaochu Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. 2023. Mistral 7B. ArXiv abs/2310.06825 (2023).

Hannah Rose Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, and Scott A. Hale. 2023. The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values. ArXiv abs/2310.07629 (2023).

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. 2020. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS abs/2005.11401 (2020).

Yuan-Fang Li, Sébastien Bubeck, Ronen Eldan, Allison Del Giorno, Suriya Gunasekar, and Yin Tat Lee. 2023. Textbooks Are All You Need II: phi-1.5 technical report. ArXiv abs/2309.05463 (2023).

Thiago Dias Oliva, Dennys Marcelo Antonialli, and Alessandra Gomes. 2020. Fighting Hate Speech, Silencing Drag Queens? Artificial Intelligence in Content Moderation and Risks to LGBTQ Voices Online. Sexuality & Culture 25 (2020), 700 – 732.

Vishakh Padmakumar and He He. 2023. Does Writing with Language Models Reduce Content Diversity? ArXiv abs/2309.05196 (2023).

Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, and Noah A. Smith. 2019. The Risk of Racial Bias in Hate Speech Detection. In Annual Meeting of the Association for Computational Linguistics.

Markus P. Schlaich, Marc Bellet, Michael A. Weber, Parisa Danaietash, George L. Bakris, John M. Flack, et al. 2022. Dual endothelin antagonist aprocitentan for resistant hypertension (PRECISION): a multicentre, blinded, randomised, parallel-group, phase 3 trial. The Lancet 400 (2022), 1927–1937.

Donghoon Shin, Lucy Lu Wang, and Gary Hsieh. 2024. From Paper to Card: Transforming Design Implications with Generative AI. In Proceedings of the 2024 ACM CHI conference on Human Factors in Computing Systems.

Nikhil Singh, Lucy Lu Wang, and Jonathan Bragg. 2024. FigurA11y: AI Assistance for Scientific Alt Text Writing. In Proceedings of the 2024 ACM International Conference on Intelligent User Interfaces (IUI).

Zeqiu Wu, Yushi Hu, Weijia Shi, Nouha Dziri, Alane Suhr, Prithviraj Ammanabrolu, Noah A. Smith, Mari Ostendorf, and Hannaneh Hajishirzi. 2023. Fine-Grained Human Feedback Gives Better Rewards for Language Model Training. ArXiv abs/2306.01693 (2023).

Travis Zack, Eric P. Lehman, Mirac Suzgun, Jorge Alberto Rodriguez, Leo Anthony Celi, Judy Gichoya, Daniel Jurafsky, Peter Szolovits, David W. Bates, Raja-Elie E. Abdulnour, Atul Janardhan Butte, and Emily Alsentzer. 2023. Coding Inequity: Assessing GPT-4’s Potential for Perpetuating Racial and Gender Biases in Healthcare. In medRxiv.

Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, and Kai-Wei Chang. 2017. Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints. In Conference on Empirical Methods in Natural Language Processing.

Dennis Zyska, Nils Dycke, Jan Buchmann, Ilia Kuznetsov, and Iryna Gurevych. 2023. CARE: Collaborative AI-Assisted Reading Environment. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), Danushka Bollegala, Ruihong Huang, and Alan Ritter (Eds.). Association for Computational Linguistics, Toronto, Canada, 291–303.

endnotes on page 24


Libraries, AI and Training Collections

One reason we are sensitive about genAI (generative AI) is because knowledge and language are central to our sense of ourselves and to the institutions which are important to us. Accordingly, application of genAI raises major questions across the cultural, scholarly and civic contexts that are important to libraries. In this context, I like Alison Gopnik’s characterization of genAI as a cultural technology. Cultural technologies, in her terms, provide ways of communicating information between groups of people. Examples, she suggests, are “writing, print, libraries, internet search engines or even language itself.” She goes on to argue that asking whether an LLM is intelligent or knows about the world is “like asking whether the University of California’s library is intelligent or whether a Google search ‘knows’ the answer to your questions.” She reminds us that previous cultural technologies also caused concerns, referencing Socrates’ thoughts about the effect of writing on our ability to remember and as a potential spreader of misinformation. And she also notes that past cultural technologies “required new norms, rules, laws and institutions to make sure that the good outweighs the ill, from shaming liars and honoring truth-tellers to inventing fact-checkers, librarians, libel laws and privacy regulations.”

GenAI is in very early stages. It is a technology which operates on representations of human knowledge: on web pages, documents, images and other knowledge resources. It transforms such inputs into a statistical model which can be used to generate new representations in response to inputs. As such, it is a cultural technology which can be used not only to discover, summarise or otherwise operate with existing knowledge resources, but also to generate new ones. Such generation is imitative but is potentially significant in various ways.
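A toy example makes the "statistical model" point concrete: the bigram model below is built from a tiny invented corpus and then used to generate new word sequences. Real LLMs are vastly larger neural networks, but the transform-then-generate principle is the same in miniature; the corpus and sampling choices here are purely illustrative.

```python
# Toy "statistical model of text": a bigram model built from a tiny
# corpus, then used to generate new sequences. Illustrative only.
import random
from collections import defaultdict

def train_bigrams(corpus):
    """Record, for each word, which words follow it in the corpus."""
    model = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for a, b in zip(words, words[1:]):
            model[a].append(b)
    return model

def generate(model, start, length=5, seed=0):
    """Walk the model, sampling a plausible next word at each step."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = ["libraries curate knowledge",
          "libraries serve communities",
          "communities curate knowledge"]
model = train_bigrams(corpus)
text = generate(model, "libraries")
```

Even this trivial model can emit a sequence that appears in no input sentence, which is the sense in which generation is imitative yet produces new representations.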

Many expect it to become a broad-based cultural technology as an integral part of the systems and services which structure workflow, knowledge consumption and exchange, and commerce. If this is so, then we can expect to see evolution in several directions. We will see new “norms, rules, laws and institutions” which guide use. These are influenceable, and I argue elsewhere that library organizations should be pooling their influence and attention for strongest impact.

It also means that there will be increasing focus on the resources, the representations of knowledge, from which the statistical models that power the LLMs are created. The first generation of genAI, and the foundational LLMs that power the services from the major players, are based on large training collections opportunistically assembled from broad web scrapes, large content aggregations, and readily available knowledge resources (Common Crawl, Wikipedia, Project Gutenberg, …). In a delightfully presented discussion, Knowing Machines outlines how such LLMs are far from being a raw record of the available cultural record. The record is distorted by algorithmic manipulation at scale (what Knowing Machines calls “curation by statistics”), commercial and SEO interests in how web pages and images are presented, and other artifacts of their construction, selection and processing.

Appraisal of knowledge resources will become progressively more important as the focus on mitigating bias, on quality of response, and on richness of content and context increases.

There is a double dynamic here, which is of central importance to those who have historically curated the scholarly and cultural record, including libraries. First, existing providers of LLMs will want to use high quality knowledge resources to create more useful training data, to improve their models and the services based on them. We can see this at play in the discussions between AI companies and major providers of knowledge resources (e.g., the New York Times or Reddit). Some of these discussions are being pursued in commercial negotiation, some in the courts as questions over the boundaries of authorized use of content are litigated. The New York Times recently reported some of the ways in which the big providers are trying to extend the reach and raise the quality of materials in their training collections. OpenAI, for example, converted YouTube audio to text and added the transcripts to the training collection of one of its models. Apparently, the NYT reports, Meta considered acquiring Simon & Schuster as a source of high-quality text.

Second, providers of knowledge resources will want to benefit from this latest cultural technology themselves, preferably in controlled ways. This may mean licensing data to the large players, creating LLMs themselves (if they have the scale), or participating in emerging cultural, scholarly or other mission-aligned initiatives. Of course, there will be multiple possible service configurations. For example, several providers in our space are experimenting with retrieval augmented generation, combining searches in their own resources with the processing capacity of LLMs. Another example is use of the custom GPTs that OpenAI recently made available, where a provider can load their own data to work alongside GPT. The Dimensions GPT from Digital Science is an indicative example. Digital Science emphasizes trust through the “combination of scientific evidence and generative AI.”

I was tempted to call the representations that are used to train large language models “training libraries” here, to highlight that they are selected and can be curated to a greater or lesser degree. It seems to me that the use of the phrase “training data” does not signal the discretion in what is collected. It is more technocratic, somehow signaling that it is neutral, routine and under control. However, “collection” might be better, as it does not have the connotation of library, but does acknowledge the selective activity of collecting.

I recently discussed several initiatives to create specialist LLMs based on curated training collections of scholarly and cultural materials. The National Library of Sweden, for example, has created training collections based on Swedish cultural materials. The Paul Allen Institute has developed a training collection, Dolma, based on various freely available collections of material to support its scholarly large language model, OLMo. This includes data sets commonly used by others, but it also includes data from a large collection of open access articles (the Paul Allen Institute produces Semantic Scholar). I also noted how Bloomberg, Forbes and others were leveraging their archives.

A particularly interesting case is provided by the BBC, whose historical archives represent a deep cultural and social resource of potentially major interest. They are cautiously discussing the addition of this resource to the training collections used by the large technology firms developing foundation models. However, they are also looking at creating their own large language models for internal use. This dual approach may become more common, as organizations consider how best to participate in this emerging cultural technology.

Here, I want to focus briefly on two potential training collections related to library interests. The first involves knowledge resources created or controlled by the library; the second relates to the scholarly literature, where the library is a partner and a consumer. My perspective is prospective.

Libraries, archives and related organizations have invested major intellectual effort in several areas over many years. In each case, there is significant contextual or relationship data which could be amenable to processing by this latest cultural technology. Here are three examples. First is LibGuides. The aggregated LibGuides collection contains many resources contextualized by subject, course and other attributes. I will be interested to see whether Springshare, the provider of LibGuides, works with the contributing libraries to see what might be possible. Is this a deep enough resource to be of interest? I am not sure, but it would be interesting to know what services based on it might look like, whether in conjunction with GPT (or another LLM) or not.

A second is around bibliographic infrastructure and associated resources. OCLC has invested in a linked data infrastructure around WorldCat and associated vocabularies. HathiTrust curates a large collection of full texts, which are described by WorldCat. What would a library built from WorldCat and HathiTrust be like? How useful would it be? What about WorldCat, HathiTrust and Jstor? This is unlikely to happen without some major external stimulus, and it is unclear what the organizational context of any output would be, but it seems worth a discussion. An interesting question in this context is whether Google Books is already mobilized within Gemini, the Google LLM.

The third area is maybe the most intriguing as it is the least replicable elsewhere. In aggregate, libraries and archives have a massive collection of finding aids, rich contextual, hierarchical data. There are some extant aggregations of finding aids and related data in Calames, in France, for example, or in the Archives Hub and ArchiveGrid in the UK and US respectively. There is also a range of regional and other aggregations, such as Archives West, for example. This data is not always highly structured. And of course, it is very far from complete, at the institutional level and at the aggregate level. But thinking about a large aggregation of finding aids and related data as a training collection extends the ongoing discussion about discovery. For example, a recent major collaborative study, NAFAN (Building a National Finding Aid Network), explored the feasibility of a national US approach to archival discovery, understanding the gaps in coverage and other issues.

To be very clear, I am not suggesting the AI-assisted creation of finding aids here; my interest is in training collections and LLMs. I am thinking about amplifying the existing discussion about how their existence is effectively disclosed and how the intellectual investment in description is mobilized. How to make them more valuable. Of course doing something like this would be a major undertaking, not least because of the major collective action problem it poses. While many people might like to see this, it is difficult to individually advance it. The conjunction of an existing archival discovery question with the possibility of AI amplification, however, does rise to the level of a community grand challenge, which poses three initial questions. The first is where to secure the funding to get something like this off the ground. The second is one of agency, both in any startup phase but also from an operations and business perspective going forward. The third is securing post start-up funding. As a community, we are not good at sustaining large scale infrastructure as a community asset. While one could imagine some income from licensing a resource like this to the foundational LLM providers, it is unlikely to sustain an operation over time. Again, I think that a discussion about this might be advanced by several key national organizations. While individual libraries will have materials harvested by the LLM providers, it would also make sense to think about how one would advance a community discussion about infrastructure and organized disclosure. The NAFAN partners may be a basis for some discussion, but the potential interest goes beyond this to other organizations and funders.

My second example is less speculative; indeed, components of it are already in place. In recent years, we have seen the emergence of three organizations which I have called “scholarly communication service providers.” These are Elsevier, Digital Science (a part of Holtzbrinck Publishing Group, also the majority owner of Springer Nature), and Clarivate. These organizations share several characteristics. They have the scale and expertise to do work in this space. They curate major reservoirs of scientific literature, representing a significant part of the scientific record. However, their interests are not limited to the published record; they are interested in a full characterization of the scientific enterprise. To that end, they also curate a variety of other entities (for example, people, institutions, funding, methods, and so on) and their relationships. Each has made a significant investment in building out a research graph, manifested in various products. They have also built a range of workflow services, supporting researcher behaviors and the creation and management of data about these research entities. They provide research and university analytics services, leveraging their data assets to provide intelligence to universities and departments about their own research performance and how it compares with others. Each has also been making announcements about AI adoption and partnership.

There are other players here, notably the Paul Allen Institute, which I have already mentioned, which has the expertise, connections and pockets necessary to make an impact.

However, the three organizations above are especially interesting in our space given the range of their involvements and the variety of ways in which the scholarly community, including libraries, already relies on their services. Clearly, we are seeing some incremental changes as AI gets incorporated in discovery and related services. This will continue. We are likely to see summarization, literature reviews, recommended reading or relevant areas of study. It will be interesting to see what sort of analytics services emerge. And beyond that there is clearly scope for new services.


The scale, data and reach that these three organizations have mean that they will be able to create important new services, based on rich training collections of bibliographic, profile, and other data. This raises some questions for libraries and universities. One is the extent to which they are willing to have the data from their workflow solutions (e.g., research information management systems, research data management systems, library systems) flow into the training collections. A second is around trust in black-boxed outputs, not only in discovery contexts but in decision-making contexts. That said, these companies share a compelling interest in developing approaches which inspire confidence in outputs. However, it also raises the prospect of interacting with scientific knowledge in different ways, facilitated by new cultural technologies. This potentially shifts the open access debate in important ways. The focus of the open access discussion has been on the article, the journal and the package. If an important element of access to scholarly knowledge is through new genAI-enabled services, then who has access to them and under what conditions becomes a central question. The open access discussion shifts to the LLM or the AI-enabled configuration.

GenAI appears to be an important cultural technology. Emerging training collections are core to its operation. The large foundation models will continue to grow, alongside more specialist resources which might be used in various configurations. Transparency in the construction of training collections has emerged as an important consideration. Understanding and explaining the construction of training collections will be an important part of what libraries might do, especially where they are licensing services based upon them or seeing data from their institutions go into them. However, it would be good to be more actively involved in this important cultural technology. This poses a collective action question, which makes it critical for those who act in the collective library interest to develop frameworks in which libraries can work together.

Generative AI for Scholarly Information Access continued from page 21


1. Sometimes by bending or possibly breaking rules, as discussed in this NYT article.


3. Such as TEI XML or JATS XML.


Some Next Generation Legal Issues for Training Generative Technologies


Generative technologies1 like large language models and diffusion models can generate startlingly advanced textual and visual creations based on relatively minimal human prompting. Versions of these technologies have been around for years, but the high-profile release in the last year or two of some especially powerful consumer-facing tools — ChatGPT and Stable Diffusion, in particular — created a stir in the popular imagination. They also prompted a storm of litigation as copyright holders sought to block or license both the use of in-copyright materials to “train” these technologies and any allegedly infringing works they may create. For their part, technology companies have already collected and exploited massive amounts of data from the open internet without asking permission, and their thirst for data has them looking at whether and how they can find more training data.2

Current litigation will likely provide useful answers to the two threshold questions that every “what’s up with AI and copyright?” explainer must inevitably cover:

1) Are the creators of LLM and diffusion models within their fair use rights (more specifically within the line of fair use case law that permits computer processing of in-copyright works for “transformative” purposes) when they use in-copyright works as “inputs” to train these models?

2) Under what circumstances do the “outputs” of these models infringe on the copyrights of existing works?

The most likely answers are “Generally, yes,” and “The same circumstances as when a human author’s work is infringing, i.e., when the output is substantially similar to an in-copyright work.” Judges are human, of course, and juries even more so; they can zig when you expect them to zag. But my interest here is in what happens next. Assuming we get those most likely answers or something close enough that this technology continues to develop without major copyright constraints, what follows is a brief tour of what we might call the next generation of legal questions and challenges for training generative technologies. Fundamentally, these are questions about whether and how creators and controllers of data (understood broadly to mean any kind of information that could be used in training, not just, and not typically, numerical data) can block, monetize, or otherwise shape the use of that data in training generative models. After exploring some of the strategies they might try under current law, I’ll discuss briefly whether any of this is good policy.

After Copyright – Publishers’ Legal Tools for Controlling, Blocking, and Monetizing Generative Technology Training

Creators and (more often) aggregators and monetizers of content that could be used in the development of generative technologies have a variety of tools at their disposal to control

access and reuse. Some of these tools are self-help, but most have an element of legal enforceability, creating at least a risk of liability for those who ignore, circumvent, or defy them. Generative technology developers are likely to run into each of these in their search for data to refine and improve their tools.

Scraping publicly accessible websites is a popular way to gather training data, but a tangle of potentially overlapping legal provisions awaits the would-be scraper. Recent litigation seems to have taken the federal anti-hacking statute (the Computer Fraud and Abuse Act) mostly off the table; this is welcome news, but its import has been wildly exaggerated. As one expert has quipped, “If you’re looking for legal issues, web scraping is a hornet’s nest on top of a beehive on top of a wasp’s nest resting on an anthill with scorpions.”3 Breach of contract, trespass to chattels (the equivalent of trespassing on private land, but for other property like servers), and any number of state law tort claims (civil claims that can arise when someone’s wrongful action causes harm to another, including to their reputation, privacy, or livelihood) are all still live possibilities. If “training” generative tools is held to be fair use, technology developers may argue that federal copyright law preempts these other legal claims, but the success of that argument will depend on the facts in a specific case.

Beyond the open web lies the vast sea of content kept behind a paywall, typically subject to express agreement (i.e., the user has to affirmatively acknowledge, truthfully or not, that they have read, understood, and agreed) with terms of use as a condition of access. Commercial STEM publishers, increasingly many news outlets, and commercial streaming platforms will all fit into this category. Terms of use on open websites may be subject to challenge on the grounds that users are not on notice of their terms, do not give consent, or both; terms imposed on paywalled content are not likely to be vulnerable to such challenges. Universities have already drawn attention to major publishers like Lexis adding or highlighting “no AI” language in their licenses and user agreements.4 Whether such provisions can override copyright limitations and exceptions like fair use is an unsettled question in the law, and one with increasingly broad and pressing implications for libraries and other research institutions.5 The impact of contracts on the otherwise lawful development and use of generative technologies can be added to the list of concerns.

Two other forms of rightsholder “self-help” got a boost in the late 1990s from the Digital Millennium Copyright Act: so-called “technological protection measures” (TPMs) and “copyright management information” (CMI). TPMs are like digital locks placed on copies of works to control access and use; examples include encryption and the use of authentication servers. CMI includes watermarks and digital fingerprints that identify the copyright owner of a work. TPMs and CMI may enable rightsholders to prevent or track use of their materials and removing these measures (which can be technically simple and


may even happen inadvertently) can trigger legal liability. A special triennial rulemaking hosted by the U.S. Copyright Office can grant exemptions that let certain users break TPMs for lawful purposes,6 and the Office has granted exemptions related to text and data mining,7 but there is not currently an exemption that would clearly permit circumvention of TPMs for the purpose of training generative technologies. And since a rulemaking cycle is currently underway, no such rule will exist before October 2027, at the earliest.

Some technology developers may not break locks or breach user agreements themselves, but they may use training data released by individuals or groups who did. Indeed, plaintiffs in some of the lawsuits regarding this technology allege that some major generative tools have been trained partially on data derived from so-called “shadow libraries” of in-copyright books made available online without the copyright holder’s permission — entities like Bibliotik, Library Genesis, or Z-Library. The “books3” dataset is the primary example of training data drawn from these sites, and journalists report that several major models have been trained with that data.8 This could be an issue for technology developers if the courts decide to consider the origin of the data as part of their fair use analysis, counting arguably “ill-gotten” data as a sign of “bad faith.” The Supreme Court wrote in Harper & Row v. Nation Enterprises that “fair use presumes good faith and fair dealing,” and penalized The Nation for relying on a “purloined manuscript” as the source of excerpts it published from a forthcoming book.9 Scholars and courts (including the Supreme Court itself in a later case10) have cast doubt on the validity of the “good faith” factor, but it lives on.11 If courts decide to give weight to the factor, the use of contracts and technical security measures could become an effective bar to fair use, even against users who do not breach contracts or circumvent digital locks.

Policy Implications: the End of Copyright?

Commentators sometimes suggest that digital technologies like these could spell “the end of copyright” if they can operate free of copyright constraints due to fair use. The previous section raises the specter of an opposite outcome: that the use of paywalls, passwords, contracts, and common law torts could spell the end of the centuries-long balance copyright strikes between the protection of expression on one hand and the free and fair circulation and reuse of facts, ideas, and information on the other.

This is not the first time that unfettered copying has made data publishers nervous. Decades before the Authors Guild v. Google case (in which the Second Circuit held that fair use permits copying millions of in-copyright books to create a search tool, among other things), the Supreme Court faced the question of whether and how to apply copyright to the wholesale copying of large collections of factual information. The publishers of factual databases argued for years that because of the effort involved in gathering, organizing, and publishing factual data (things like sports statistics, market performance data, and even address information), they should be entitled to copyright protection in the resulting databases. Under this theory, the first entity to gather and publish data could block others from extracting and sharing information from them without payment or permission. According to this theory, also known as the “industrious collection” or “sweat of the brow” doctrine, protection is appropriate to prevent “free riding” and ensure adequate incentives for the creation of new databases. Until 1991, U.S. courts were divided on this theory, but the Supreme Court settled the issue by rejecting “sweat of the brow” in Feist Publications, Inc. v. Rural Telephone Service Co., an opinion about warring publishers of telephone directories.

After explaining that as a Constitutional matter, copyright only protects an author’s original expression, and not any facts or ideas that may be conveyed in that expression, Justice O’Connor explains the underlying policy for this treatment:

It may seem unfair that much of the fruit of the compiler’s labor may be used by others without compensation. As Justice Brennan has correctly observed, however, this is not “some unforeseen byproduct of a statutory scheme.” It is, rather, “the essence of copyright,” and a constitutional requirement. The primary objective of copyright is not to reward the labor of authors, but “[t]o promote the Progress of Science and useful Arts.” To this end, copyright assures authors the right to their original expression, but encourages others to build freely upon the ideas and information conveyed by a work. … This result is neither unfair nor unfortunate. It is the means by which copyright advances the progress of science and art.12

Justice Brandeis gave equally eloquent expression to this general principle in his dissent in Int’l News Service v. Associated Press: “The general rule of law is that the noblest of human productions — knowledge, truths ascertained, conceptions, and ideas — became, after voluntary communication to others, free as the air to common use.”13

This principle — that copyright protects the author against unfair circulation of her expression, not of the facts and ideas it conveys — explains why copyright cannot (and should not) protect rights holders from some kinds of competition by generative technologies. For example, a generative tool whose training included books about Franklin D. Roosevelt might be able to answer a factual question like “When was Franklin D. Roosevelt stricken with polio?” No doubt its ability to do so would be the result of the inclusion of this fact in some of the works in its training data, and for some users of the tool this answer will obviate the need to buy any of the works represented in the data. But, as Judge Leval explained in his opinion in the Google Books case, when a book search result (a “snippet” of a book’s text) reveals the answer to a factual question like this: “what the searcher derived from the snippet was a historical fact. [The] [a]uthor[’s] … copyright does not extend to the facts communicated by his book. It protects only the author’s manner of expression.… [I]t would be a rare case in which the searcher’s interest in the protected aspect of the author’s work would be satisfied by what is available from snippet view.”14

So if the user of a generative tool is satisfied by its reproduction of facts and information derived from a protected work, that’s not a substitution effect that copyright protection has ever or should ever prevent. A new biography of a major political or cultural figure may include significant, newly discovered, difficult-to-unearth facts that the author hopes will entice readers to buy the book. Nevertheless, the day it comes out, any diligent Wikipedia editor is free to transpose all those juicy facts over to the subject’s Wikipedia entry, making all those hard-won facts easily and freely available to all. Tabloids, social media personalities, and even the New York Times all benefit from this freedom to circulate, discuss, and build on facts and information revealed in others’ in-copyright works. Generative tools are only the latest mechanism to take advantage of the law’s recognition that facts and ideas should spread as freely as Thomas Jefferson’s famous candle flame.15


At the same time, generative tools will have to play by the same copyright rules as everyone else when they answer these factual questions — they cannot offer up answers that are substantially similar to the expressive elements of their training data. So, to the extent that the value of an author’s work derives from her writing, rather than her diligent collection of facts, that value is still protected by copyright.

The courts will likely rule that development of generative technologies is generally protected by fair use, in part because they provide the public with increased access to unprotected elements of in-copyright works — facts and ideas that have always been “free as the air to common use.” If so, the protectionist legal tricks described above would be, if effective, nothing less than circumventions of the Constitutional design of copyright.

As this article went to press, the federal court in the Northern District of California issued an opinion that gives reason to hope that copyright’s balance may survive the war over generative tech. In X Corp. v. Bright Data Ltd., 3:23-cv-03698 (N.D. Cal. May 9, 2024), Judge Alsup rejected attempts by the social network formerly known as Twitter to block scraping of public data using several of the non-copyright theories canvassed above — breach of contract, trespass to chattels, and state competition law. The court found that none of these claims could block Bright Data from scraping public data from X’s website. Most importantly, the court found that federal copyright law preempts X’s attempt to “yank into its private domain and hold for sale information open to all.” The court also relied heavily on its finding that X could not show any cognizable harm from Bright’s scraping. In essence, the court ruled that if the only interest being vindicated by a legal claim is the interest in controlling the collection and reuse of information, that interest should be settled by copyright law. Here’s hoping other courts take note.

1. I use the term “generative technologies” as a bit of a provocation here in an attempt (in vain, too late, but still!) to avoid repeating terms like “artificial intelligence” and “machine learning” that can obscure as much as they reveal about these tools. See generally Emily Tucker, Artifice and Intelligence, Center on Privacy & Technology at Georgetown Law (Mar. 8, 2022), center-on-privacy-technology/artifice-and-intelligence%C2%B9-f00da128d3cd (last visited Jun 16, 2023)(characterizing “artificial intelligence” as “a phrase that now functions in the vernacular primarily to obfuscate, alienate, and glamorize.”).

2. See, e.g., Cade Metz, Cecilia Kang, Sheera Frenkel, Stuart A. Thompson and Nico Grant, How Tech Giants Cut Corners to Harvest Data for A.I., The New York Times, April 6, 2024,

3. Kieran McCarthy, Hello, You’ve Been Referred Here Because You’re Wrong About Web Scraping Laws (Guest Blog Post, Part 2 of 2), Technology & Marketing Law Blog (2022), (last visited Mar 13, 2023).

4. The Office of Scholarly Communication at the University of California, Berkeley, has provided important leadership in pushing back on this effort. See, e.g., Rachael Samberg, Tim Vollmer and Samantha Teremi, Fair use rights to conduct text and data mining and use artificial intelligence tools are essential for UC research and teaching, Mar. 12, 2024, https://osc.universityofcalifornia. edu/2024/03/fair-use-tdm-ai-restrictive-agreements/.

5. See generally Katherine Klosek, Copyright and Contracts: Issues and Strategies (2022), uploads/2022/07/Copyright-and-Contracts-Paper.pdf

6. The Copyright Office hosts a website with information about these rules, including current rules, an explanation of the rulemaking process, and an archive of materials related to past rulemakings:

7. The Authors Alliance led the effort to secure this exemption, and they provide helpful information about it. Authors Alliance, Update: Librarian Of Congress Grants 1201 Exemption To Enable Text Data Mining Research, Oct. 27, 2021, https://www.

8. Kyle Barr, Anti-Piracy Group Takes Massive AI Training Dataset “Books3” Offline, Gizmodo, Aug. 18, 2023, anti-piracy-group-takes-ai-training-dataset-books3-off-1850743763.

9. Harper & Row, Publishers, Inc., et al. v. Nation Enterprises, et al., 471 U.S. 539 (1985).

10. See Campbell v. Acuff-Rose, 510 U.S. 569, 585 n. 18 (1994) (noting range of opinions on relevance of “good faith,” including its complete rejection by Judge Pierre N. Leval in the same law review article the Court relied upon heavily for its fair use analysis elsewhere in the Campbell opinion, but not endorsing any position).

11. See generally Frankel, Simon and Kellogg, Matt, Bad Faith and Fair Use (September 1, 2012). Journal of the Copyright Society of the USA, Vol. 60, p. 1, 2013, Available at SSRN: I have written elsewhere about this intersection between “good faith” and training generative tools with “pirate data.” Brandon Butler, “Stolen Books,” Bad Faith, and Fair Use, Fair Use Week,

12. Feist, 499 U.S. at 349-50 (internal citations omitted).

13. International News Service v. Associated Press, 248 U.S. 215, 250 (1918).

14. Authors Guild v. Google, Inc., 804 F.3d 202, 224 (2d Cir. 2015).

15. Letter from Thomas Jefferson to Isaac McPherson (Aug. 13, 1813), in 13 THE WRITINGS OF THOMAS JEFFERSON 326, 333–35 (Andrew A. Lipscomb ed., 1903) (“He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me.”) Deliciously, recent scholarship reveals that Jefferson’s candle simile is itself a bit of light that he took almost verbatim from Cicero’s De Officiis. See Jeremy N. Sheff, Jefferson’s Taper, 73 SMU L. Rev. 299 (2020).



Visualising Collection Records with the Use of Generative AI

How Text-To-Image/Video Generative AI may help Libraries and Museums

Collections Accessibility

Libraries, Archives, and Museums (LAM) have always found it challenging to improve the accessibility of their collections. With physical objects, institutions lack the physical space for public exposure. As digitisation technology became more affordable and the Internet widespread, there was hope that the accessibility issue would be resolved once and for all. Unfortunately, to quote Eryk Salvaggio, we have now moved from the Age of Information to the Age of Noise.1 We are unable to find text, sounds, images or videos, all confused in a gigantic sea of data. And if we are lucky enough to find them, that information is almost immediately lost again, overwhelmed by the incessant stream of new digital input.

The incredibly rapid progress of neural networks and machine learning technologies has now opened the possibility of first collecting and scanning, and then tagging, classifying, categorising and clustering not only textual documentation, but also images (and consequently videos, which are sequences of frames).

The newest AI tools (in particular diffusion models) — some of them even open source — seem able to recognise objects in images with an accuracy that makes them worth applying in the LAM sector. What’s more, once combined with Large Language Models, they can even describe whole scenes, actions, historical or environmental situations, or photographic and artistic styles.

The “machine” can now help LAM institutions with cataloguing tasks that have become humanly impossible. Applications have already been successfully developed by some archives and museums, like ACMI, Australia’s national museum of screen culture, and the National Film & Sound Archive (NFSA).

If embeddings and training on data sets (both words and pixels) allow us to retrieve images and videos much more efficiently, their generation with AI tools is the next immediate step. If the machine can “understand” what humans see in every single image, it can create a new image once we instruct it what we want to see. Because it cannot draw, it reverses the image-reading process, morphing together data from the billions of images used for its training. The machine can do this in such a sophisticated way that text-to-video generative AI triggers a “wow” reaction any time we watch its “creations.”

Figure 1: Australian Museum Collection web access

This may generate another legitimate need for all library and museum visitors: in the future they may expect to access a visual representation of the objects in archives and collections, instead of just scrolling through dense lists of database records (see Figure 1). So would it be worth generating and publishing an image or (better yet) a short video clip to describe and contextualise each item in a collection? How much would that benefit the institution’s visitors, and the cultural sector more broadly?

While institutions will never be able to afford the curators’ and media producers’ time needed to visually augment every collection record, generative AI may be able to do the job automatically and quickly, provided the output makes sense and is properly contextualised.

This will not happen if we just run an application that feeds record metadata to a Large Language Model. The challenge is very similar to what has already been observed with LLM chats, where reasoning abilities are limited and answers can be polluted with hallucinations, digressions and inconsistencies if the model relies only on its general training data and the inputted prompt.

Many are trying to solve the issue with LLM augmentation, usually fine-tuning or RAG (retrieval-augmented generation), but more techniques, or combinations of them, are surfacing every week.
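To make the retrieval-augmented idea concrete, here is a minimal, purely illustrative sketch. It is not any production RAG stack: real systems use learned embedding vectors and a vector database, whereas this toy uses bag-of-words cosine similarity, and all document and metadata strings are invented for the example.

```python
import re
from collections import Counter
from math import sqrt

def similarity(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words counts -- a toy stand-in
    for the embedding vectors a real RAG system would use."""
    wa = Counter(re.findall(r"[a-z0-9]+", a.lower()))
    wb = Counter(re.findall(r"[a-z0-9]+", b.lower()))
    dot = sum(wa[t] * wb[t] for t in wa)
    norm = sqrt(sum(v * v for v in wa.values())) * sqrt(sum(v * v for v in wb.values()))
    return dot / norm if norm else 0.0

def augment_prompt(record_metadata: str, context_docs: list[str], k: int = 2) -> str:
    """Retrieve the k context documents most similar to the record's
    metadata and prepend them to the generation prompt."""
    ranked = sorted(context_docs, key=lambda d: similarity(record_metadata, d), reverse=True)
    context = "\n".join(ranked[:k])
    return f"Context:\n{context}\n\nGenerate a short video describing:\n{record_metadata}"

# Hypothetical institutional documents and record metadata.
docs = [
    "The museum focuses on screen culture and the history of cinema.",
    "Museum videos must be suitable for audiences of all ages.",
    "Cafeteria menu changes weekly.",
]
prompt = augment_prompt(
    "1937 educational film about the construction industry and cinema history", docs
)
```

The point of the sketch is only the shape of the pipeline: rank institutional context by relevance to the record, keep the best matches, and feed them to the generator alongside the metadata, so the output is grounded in the institution rather than in the model’s general training data alone.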

We might want then to apply the same contextualising techniques to record-to-video generation.

ACMI, Australia’s museum of screen culture in Melbourne, is an interesting case study.

They have a vast collection of items (donated, acquired, or produced in-house) that covers a variety of screen culture genres: from home videos to videogames, from projection equipment to TV ads, from feature films to historical documentaries. Their main challenge is to ensure online accessibility to those objects for an audience of ACMI visitors, researchers and, in general, screen culture aficionados.

ACMI has been quite innovative in exploring and implementing AI technology to improve the searchability of their collections. For instance, they have been audio-captioning all published videos using Whisper2 and then search-indexing the text for quick retrieval. They have also augmented the Video Search feature with an object-detection tool, BLIP 2, coupled with a Large Language Model for captioning the frames.3 Finally, they are vectorising all collection metadata and using neural search to discover unexpected connections among the collection items.4
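The “caption, then search-index” step can be illustrated with a tiny sketch. This is not ACMI’s implementation: the item IDs and transcripts below are invented, and a real transcript would come from an ASR tool such as Whisper; the sketch only shows the inverted-index idea that makes machine-generated captions quickly searchable.

```python
import re
from collections import defaultdict

def build_index(transcripts: dict[str, str]) -> dict[str, set[str]]:
    """Inverted index: word -> set of item ids whose (machine-generated)
    transcript contains that word."""
    index = defaultdict(set)
    for item_id, text in transcripts.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(item_id)
    return index

def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Items whose transcripts contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    results = [index.get(w, set()) for w in words]
    return set.intersection(*results) if results else set()

# Hypothetical transcripts, as an ASR tool might produce them.
transcripts = {
    "clip-001": "A newsreel about the construction of the harbour bridge.",
    "clip-002": "Home movie of a family picnic near the harbour.",
    "clip-003": "Television advertisement for a 1960s sedan.",
}
index = build_index(transcripts)
```

With the index built once at ingest, a query like `search(index, "harbour construction")` narrows thousands of items to the clips whose spoken audio actually mentions those words, without re-scanning any transcript.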

However, the ACMI Collection also includes undigitized video material. These items are still retrievable via the website, but they are not accompanied by any images, trailers, teasers or any kind of visual description.

I have chosen a couple of examples because I was interested to see whether a free generative AI tool, with all default settings and prompted only with the metadata associated with the records, could give us a glimpse of the potential of this technology for visualising collection items with minimal effort.

Later, I will examine how we can contextualise those examples, augmenting the generative AI processing.

The first item is retrievable from the ACMI website5 and it’s a 1937 U.S. educational film about the building construction craft and industry. Only a descriptive text and some metadata are published, and there is a note saying that, unfortunately, ACMI doesn’t have any image or video for this film (which is 16mm, not digitised).

For my experiment, I chose Runway’s Gen-2 multimodal system,6 because of its popularity, friendly interface and processing speed.

As a first step, I generated a few images, prompting only with the content from the ACMI website item page. As everyone knows, “a picture is worth a thousand words,” and this is true for an AI-generated image too. But the image must be right on; it must really summarise everything in one shot: what it is about, why the object is there, how it got there, how it is connected, and how it affects our cultures and environments. This is a very complicated task for an AI that can only reconstruct an image starting from clusters of pixels associated with words or groups of words.

I’ve added below one of the most interesting images, in which I find some appealing details (the black-and-white granularity of a 1930s film, the row of dwellings referencing the construction industry, the tracks suggesting the role of transportation). However, there are clearly some distortions, if not outright hallucinations, and a general feeling of a lack of context.

Figure 2: Generated by Gen 2 – Runway, prompting the content of

I used the same prompt to generate a very short video on the same Gen 2 – Runway platform. A video may be more entertaining and attractive than a single image (and more suitable for a “film” item). A video, which can also be associated with audio and captions, can narrate a “story.” Because it is a correlated sequence of a much greater number of images, it is capable of describing the object from multiple perspectives — its history, location, usage, technical details, relation to human society — expanding even to visiting or access information. That is a lot to communicate in a 10-second video clip.


In this case, however, the video example that I’ve generated carries the same flaws that we have noticed in the image: the historical period is definitely there, a house and a car as well, but the meaning of the scene is unknown.

Figure 3: Generated by Gen 2 – Runway, prompting the content of

It’s exciting to note that, both for images and videos, and even with the limits of an uncontextualised prompt, there are already elements showing the potential of AI-generated visualisation, without the need for any manual post-production editing.

The second example is another U.S. educational film,7 this time about immigration, and again I generated a few initial images (below an example)

Figure 4: Generated by Gen 2 – Runway, prompting the content of

and then a short video.

Figure 5: Generated by Gen 2 – Runway, prompting the content of

There are some positive clues here: the post-war atmosphere, people’s clothes, crowds gathering or queuing or walking single file to work, even a glimpse of the film colouring at that time. It means there are, although vague, references to year, location, topic and media type.

But again, looking at the video, apart from the typical text-to-video AI distortions, the impression is that many frames are repetitive and consequently wasted. Also, because it is a single uninterrupted cinema-like scene, it might mislead users and make them think it is an excerpt from the movie itself.

Considering that the prompting I used was very basic (someone might even say “brutal”), there is certainly room to improve the generated output with some additional contextualising information.

The first step may be the injection of a “system message.” In the case of the ACMI museum, this could be something like:

“I am a film museum visitor. I am exploring the museum’s collection and I have found the film < title >, made in the year < year >, in < country >, with this description <description>. Make a very short video that describes this collection item, in the cultural context of its time and location.”

This system message can be coded and automatically applied to all films in the ACMI Collection.
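A minimal sketch of what “coded and automatically applied” could look like, assuming each record is a simple dictionary of metadata; the field names (`title`, `year`, `country`, `description`) and the sample record are hypothetical, and a real system would pull these values from the collection database.

```python
# Template paraphrasing the article's example system message;
# angle brackets mark the slots filled from each record.
SYSTEM_TEMPLATE = (
    "I am a film museum visitor. I am exploring the museum's collection and "
    "I have found the film <{title}>, made in the year <{year}>, in <{country}>, "
    "with this description <{description}>. Make a very short video that "
    "describes this collection item, in the cultural context of its time and location."
)

def build_prompt(record: dict) -> str:
    """Fill the system template from one collection record's metadata,
    falling back to placeholders when a field is missing."""
    return SYSTEM_TEMPLATE.format(
        title=record.get("title", "unknown"),
        year=record.get("year", "unknown"),
        country=record.get("country", "unknown"),
        description=record.get("description", "no description"),
    )

# Hypothetical records; applying the template to the whole collection
# is then just a loop over the catalogue.
records = [
    {"title": "Building a House", "year": 1937, "country": "USA",
     "description": "Educational film about the construction industry."},
]
prompts = [build_prompt(r) for r in records]
```

The design point is that the template is written once by curators and the loop applies it uniformly, so every film record in the collection gets the same contextual framing without any per-item manual prompting.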

The output may still be too generic, and we may want to give extra context, more specific to the institution where the item is preserved and catalogued: for instance, documents that describe the strategic vision of the museum, the kinds of exhibitions it hosts, the kinds of audiences that visit, the kinds of researchers browsing the collections, and perhaps policies and regulatory guidelines for museum video production, among other things. All this documentation would be indexed and vectorised, then queried by a search engine that would in turn augment the prompt and improve the AI generator’s reasoning ability (as in a RAG system8).


We could also apply some data training, if we have numerous collection items already equipped with proprietary, curated images or thumbnails. This would be the most resource-intensive augmentation work, due to the large amount of labour and GPU hardware required to fine-tune any kind of generative AI. The cost would depend on the scale of the training data sets, the level of parametrisation, the number of iterations, and so on.

Finally, we could force some video editing with specific additional instructions: for example, “add an initial 5 seconds with a description of the entire collection” or “add a 5-second tail with the museum’s visiting times (or a list of the current exhibitions).”

Once set up, the platform would automatically process all records, potentially thousands or millions of them on ingest.

Below you can see a rough diagram of what the solution in its entirety would be:

This augmented text-to-video generation would not be exempt from the well-known challenges of adopting AI. Cultural institutions, especially in the GLAM sector, are sometimes questioned for the choices made by their curators. The issue of public trust becomes more complex if the categorisation is decided by a machine, trained on partial, limited (and sometimes illicit or morally unacceptable) data sets, with calculations made by a pre-determined, imperfect algorithm. Copyright infringement, bias and unethical content are just some aspects of a more fundamental problem, and they are even more evident when manipulating images and videos.

These challenges will exist in this theoretical idea of record-to-video generation as well. There is no guarantee that the video would not contain frames that are questionable at the least, if not totally inappropriate, and not just by ethical and cultural standards, but also through misalignment with the institution’s mission and values.

That would force institutions to adopt mitigation actions, which may include:

• Using generative AI platforms trained exclusively on known, approved and authorised data sets

• Augmenting the prompting with safeguards against biases, ethical breaches, confrontational themes or unwanted topics

• Implementing automatic quality assurance tools, preventing the publication of AI hallucinations or digressions

We can see how many moving parts are still necessary for building a reliable, consistent and safe generative AI application. This makes the operationalisation of these kinds of solutions even more challenging. Once the experimentation phase is concluded, and a prototype has been evaluated, tested, and finally approved, there are numerous issues that operations managers need to tackle.

Some of them may include:

• Planning a continuous evaluation of the generative AI platforms, adopting the best available tool within the assigned budget;

• Defining image and video parameters based on consumption scenarios (e.g., output devices);

• Ensuring that all the documentation, data sets, templates, etc, are up-to-date and secure;

• Monitoring and evaluating performances and output quality (based on agreed metrics);

• Planning and implementing corrective actions in case of complaints or incidents.

It is often said in the LAM sector that operationalising AI applications is the most difficult part; indeed, so far, cultural institutions have only rarely made their experiments live and publicly accessible.

With the enormous investment in AI research, we don’t know what we’ll see over the next few months. There is vast potential for AI video generation to improve in every aspect, and the progress in optimising sustainability and compliance with ethics principles will determine its adoption in the cultural sector.

endnotes on page 34


Embracing Serendipitous Innovation in Libraries

If you’ve experimented with AI for specific tasks, like writing your journal article or helping you do your taxes, you’ve likely been underwhelmed by the initial results. However, approaching a chatbot for a conversation without a specific objective, armed only with curiosity and a playful attitude, might have inspired solutions for problems you didn’t even know you had or sparked ideas for approaching a problem space differently. This concept of serendipitous innovation unbounded by objectives isn’t new and didn’t arise with the advent of generative AI. We have all likely heard the stories of accidental discovery of inventions like the microwave or penicillin. Yet, the seemingly boundless array of applications of AI technology, particularly generative AI, promises a significantly greater potential for unexpected discoveries.

Despite the seemingly unpredictable and uncontrollable nature of serendipity, the topic has received a surge of recent attention in the field of management studies. In a literature review of serendipity in an organizational context, economics professor Christian Busch defines serendipity as a “surprising discovery that results from unplanned moments in which our decisions and actions lead to valuable outcomes.”1 Serendipity requires individual agency, an element of surprise, and a recognition of a discovery’s value. Through his synthesis of the literature, Busch posits that organizations can cultivate serendipity by creating an environment that encourages experimentation and collaboration and by communicating the organization’s problem area at a broad enough level of specificity to allow individuals to associate value with their discoveries.

In libraries, where technological innovation is often restricted to departments with access to grants and IT resources, AI’s potential for rapid ideation, coupled with the increasing accessibility of these tools to the consumer market, could be a game-changer for materializing serendipitous discoveries across organizations. As libraries face the threat of drastic transformation through AI’s upending of search, discovery, and publishing, empowering more diverse groups of individuals to collaborate and experiment could accelerate the discovery of novel ideas for reimagining library services, data, and technology that will help users traverse this new information landscape. For highly siloed, objective-driven organizations, however, the shift needed to broaden the innovation space may be challenging to make. So how might libraries cultivate an environment where serendipitous discoveries are not only encouraged, but nourished and matured into library operations?

From my participation on the team building Indiana University’s Audiovisual Metadata Platform (AMP), I can share an example that illuminates how an openness to serendipitous discovery helped us reframe a problem space. The original hope of the AMP project was to extract metadata from digitized and previously undescribed audiovisual materials, but in testing AI tools for this purpose, we often found them falling short of our expectations. We hoped they would provide descriptive information useful for cataloging materials, but their poor accuracy and consistency made it challenging to confidently integrate them into the platform. However, through this process of experimentation and testing, we serendipitously discovered other valuable functions they could serve. For example, in exploring human-readable outputs of video shot detection algorithms — which identify the start and end times of every shot in a video — we experimented with a contact sheet format that displays a thumbnail image from each shot, arranged in order on a grid. Although this tool did not seem to offer direct benefits for video cataloging, we found, through discussions with collection managers on the team, that the contact sheets could be extremely helpful to them for other tasks. This was especially true in environments where making video accessible to staff early in the processing pipeline is technically challenging. With the availability of machine-generated contact sheets, archivists processing newly digitized materials could quickly get a sense of what is contained on unmarked media or assess how much of a tape or reel was used for recording, without needing to scrub through the entire video. Copyright specialists could efficiently scan the contact sheets to identify any potential intellectual property or performance rights needing deeper review. Researchers, too, could benefit from visually skimming frames from video content that copyright restrictions prevented them from viewing remotely, helping them decide if a trip to the reading room was worth their time.

A “contact sheet” showing center frames from shots detected by a shot detection algorithm.
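The contact sheet idea described above reduces to two small computations: picking the center frame of each detected shot, and laying the resulting thumbnails out on a grid. The sketch below illustrates only that arithmetic; the shot boundaries, frame rate, and thumbnail dimensions are hypothetical, and a real pipeline would pass the frame indices and positions to an image library to render the sheet.

```python
def center_frames(shots: list[tuple[float, float]], fps: float = 25.0) -> list[int]:
    """Frame index at the midpoint of each (start_s, end_s) shot,
    as a shot-detection tool might report the boundaries."""
    return [round((start + end) / 2 * fps) for start, end in shots]

def grid_positions(n: int, columns: int = 4, thumb_w: int = 160, thumb_h: int = 90):
    """(x, y) pixel offsets for n thumbnails laid out row by row on a grid."""
    return [((i % columns) * thumb_w, (i // columns) * thumb_h) for i in range(n)]

# Hypothetical shot boundaries (in seconds) for a short reel.
shots = [(0.0, 4.0), (4.0, 10.0), (10.0, 11.0), (11.0, 20.0), (20.0, 26.0)]
frames = center_frames(shots)        # one representative frame per shot
cells = grid_positions(len(frames))  # where each thumbnail lands on the sheet
```

Because one thumbnail per shot is enough to convey the structure of a recording, even a long tape collapses to a single glanceable page, which is exactly what made the format useful to archivists and copyright specialists.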

Soon, the team found ways to make other AI tools, which had performed poorly at their originally intended tasks, more useful by displaying their outputs in the contact sheet format. Video OCR, which transcribes text like signs or title screens from videos, was often inaccurate, and its JSON outputs were difficult to structure meaningfully. However, it excelled at identifying which frames in the video contained text, which could then be displayed as thumbnails on a contact sheet for humans to manually transcribe. Facial recognition, fraught with flaws and potential for bias, could still be used to find frames with faces for human review on a contact sheet. The potential of the entire AMP platform, initially envisioned as a metadata creation pipeline that would input undescribed media and output basic catalog records with minimal human mediation and feedback, expanded to become a workbench for the full processing and description lifecycle, its primary strength not in replacing human labor and expertise but in supporting and enhancing it. This unexpected discovery and recognition of the value of contact sheet functionality for solving other problems contributed to our greater goal of improving the efficiency of audiovisual cataloging, just not in the ways we had anticipated.

Recent experiments supported and hosted by the Library of Congress can show us how AI experiments without narrow purposes in mind can provide tools to spark new discoveries. At the Table with: Mary Church Terrell, an experiment led by interns Ide Amari Thompson and Madeline Toombs, offers an interface for the random discovery of creative prompts extracted through natural language processing from the Library’s collection of writings by civil rights activist and suffragist Mary Church Terrell. While imagined as a method to support creatives in the generative process of writing, this alternative entryway into library collections sets a stage for limitless surprise and potentiality. Starting from prompts removed from their context, like “Let them not perish of heart hunger for want of the crumb of comfort which it is in our power to bestow,” and “Remove Mary’s adenoid,” users can then dive back into the source material to learn more. While the interface may prove difficult in supporting a traditional research approach, this AI-curated randomizer provides an experience ripe for serendipitous discovery of new research or creative methodologies.

Organizations often resist solutions in search of problems, aiming instead to prove their value by strategizing with goals and objectives and measuring success with actions and key performance indicators. This approach identifies many problems in need of solutions, with employees proving their worth by aligning their work with these goals, leaving little room for exploration beyond their assigned problem space. In our consulting work, my team is usually brought into organizations to address specific problems. Even the human-centered design process we use in our work focuses on defining the right problem to solve and then ideating possible solutions to it. So how might we awaken and harness the latent potential for serendipitous innovation in AI while still achieving our goals, meeting our users’ needs, and satisfying our stakeholders?

Despite its emphasis on solving the right problem, human-centered design (HCD), or design thinking, can lay the groundwork for fostering a culture of serendipitous innovation. HCD is a framework for people-focused experimentation and problem-solving, built on four principles: centering on people, solving the right problem, thinking in systems, and developing iteratively. Throughout its six-stage process — empathize, define, ideate, prototype, test, and implement — practitioners alternate between divergent thinking to explore possibilities and convergent thinking to narrow focus and direction. While applying the HCD process from start to finish on specific problems can be an effective method for generating solutions, exploring elements of the framework at the organizational level can help to set the stage for organization-wide AI experimentation.

AVP’s human-centered design process.

In the Empathize phase, we seek to understand the humans affected by a problem who will benefit from our solutions — their experiences, needs, desires, pains, and challenges. If we consider the organization’s mission to be one broad problem area, we can use methods from the Empathize phase to better understand and communicate the breadth of needs of our users, communities, and staff (which could potentially be served by AI, amongst other solutions). If your organization has not yet developed a complex understanding of your users, their needs, and how you can serve them, now is a good time to start. Techniques used in the Empathize phase, like interviews, focus groups, shadowing, and persona development, can help to elicit more information not only about the goals your users are trying to achieve but also the challenges they face. It will also be important to understand your users’ constraints and vulnerabilities as you encourage broader solution-finding across the organization, particularly when experimenting with AI, so you can continue to provide a safe space and an environment of trust and accountability.

Consider building on your foundational user research through organization-wide workshops or focus groups to identify how your users and their communities might be vulnerable to AI harms, such as bias, discrimination, and privacy or intellectual property violations, and which specific risks pose the greatest threat to them. Resources such as the IBM Risk Atlas and the Fraunhofer Institute’s AI Assessment Catalog, which detail many types of possible AI risks, can be valuable for sparking these discussions. Future risk assessments of AI technologies you decide to adopt will depend highly upon the specific uses you intend for them. Ensuring broad diversity among your users and staff in this process will reveal more needs and risks and help you prepare to mitigate any unintended consequences of AI.

In the Define phase, our goal is to understand and articulate the core problem that needs solving before we begin ideating possible solutions. This phase is crucial for defining the right problem — one that is narrow enough to allow for actionable solutions but broad enough to accommodate a wide range of creative possibilities. At an organizational scale, this problem might be encapsulated in the mission statement or a strategic plan. Techniques like abstraction laddering can help to achieve the right level of granularity. Frame the mission statement as a problem statement (perhaps starting it with “we need to” or “how might we”) and then ask “how?” to narrow the scope of the problem. Conversely, if starting with a narrower objective, asking “why?” can broaden the scope. For example, asking “why?” of the original AMP problem statement, “how might we extract metadata from audiovisual materials?,” could return the answer “to process and catalog audiovisual materials more efficiently.” This new problem statement, “how might we process and catalog audiovisual materials more efficiently?,” opens up the problem space to invite many more potential solutions. Expanding and narrowing this problem space for your organization until it includes a wide range of problem statements that reflect your users’ needs can pave the way for serendipitous solutions.
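Abstraction laddering is a facilitation exercise rather than an algorithm, but the up-and-down movement it describes can be captured in a toy sketch. The middle and bottom rungs below reuse the AMP reframing from the text; the top, mission-level rung is invented for illustration:

```python
# Problem statements ordered from broadest (mission level) to narrowest.
# Asking "why?" climbs the ladder toward broader framings;
# asking "how?" descends toward narrower, more actionable ones.
ladder = [
    "How might we connect our communities with our collections?",             # invented mission level
    "How might we process and catalog audiovisual materials more efficiently?",
    "How might we extract metadata from audiovisual materials?",              # AMP's original framing
]

def ask_why(current):
    """Broaden: step up one rung, staying put at the top."""
    i = ladder.index(current)
    return ladder[max(i - 1, 0)]

def ask_how(current):
    """Narrow: step down one rung, staying put at the bottom."""
    i = ladder.index(current)
    return ladder[min(i + 1, len(ladder) - 1)]

print(ask_why("How might we extract metadata from audiovisual materials?"))
# → "How might we process and catalog audiovisual materials more efficiently?"
```

The useful property is that each rung remains a complete problem statement: stopping one rung higher than where you started is what opened AMP's problem space to solutions beyond metadata extraction.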


Other techniques and principles from HCD can also encourage the generation of unexpected solutions with AI. Emphasizing iteration and ideation, and allowing staff ample access, time, and space to experiment with new tools without the pressure to solve an immediate problem or produce immediate results, keeps the problem space open. Inviting diverse perspectives can foster more creative solutions, offering staff opportunities to share their findings or collaborate with individuals outside their functional areas in AI experimentation, such as through group prompt writing. Creating sandboxes, tools, and safe environments where staff can use or manipulate data, better understand potential risks, and test new ideas can facilitate innovation without the risk of exposing public users to premature or risky outcomes.

With a solid foundation in understanding user needs, AI risks, and an organizational problem space, organizations may also consider defining an AI values statement, similar to the one crafted by the Smithsonian. Such a statement can guide responsible AI development and use and open a conversation with the communities they serve. By documenting and effectively communicating these values, along with the potential benefits and risks of AI within a clearly but openly defined problem space, libraries can provide staff with guideposts to align their discoveries. As these solutions find their respective problems, they can be integrated into new, more structured cycles of the HCD process, using the empathize and define phases to better understand the use case and the ideation, prototyping, and testing phases to explore potential implementation methods, evaluating the solutions’ success and value against an established risk framework.

As libraries begin to navigate the evolving landscape of artificial intelligence, building the capacity and culture for serendipitous innovation offers a unique opportunity not only to revolutionize the way libraries support their users but also to test organizational models outside of siloed, objective-driven methods that limit cross-functional collaboration. By embracing HCD methodologies and fostering a culture of curiosity, open-minded experimentation, and openness to the unforeseen, libraries can lead the way in utilizing AI not just as a tool for optimizing current practices but as a catalyst for discovering new ways to connect with and serve their communities.


1. Busch, C. (2024), Towards a Theory of Serendipity: A Systematic Review and Conceptualization. J. Manage. Stud., 61: 1110-1151. https://doi.org/10.1111/joms.12890

Visualising Collection Records with the Use of Generative AI continued from page 31


1. Eryk Salvaggio, The Age of Noise,

2. Simon Loffler, Collection Video Transcription at Scale with Whisper,

3. Simon Loffler, Seeing Inside ACMI’s collection – part 2,

4. Simon Loffler, Embeddings and our collection,

5. ACMI Collection, Shelter (USA),

6. Runway Research, Gen-2: The Next Step Forward for Generative AI,

7. ACMI Collection, Immigration,

8. Patrick Lewis and Others, Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,


Reader’s Roundup: Monographic Musings & Reference Reviews

Column Editor: Corey Seeman (Director, Kresge Library Services, Ross School of Business, University of Michigan) <> Visit him at

Column Editor’s Note: I read an article recently that I cannot shake from my head. The article was from PetaPixel, a great news source for photography and my never-ending need to justify more photographic equipment. The article was about Arizona State Representative Alexander Kolodin, who freely admitted that he used ChatGPT to help craft part of the bill moving through the Arizona Legislature on “deepfakes.”1 The article quotes from the original piece in The Guardian and goes through his rationale.

Kolodin was writing House Bill 2394, which hopes to establish criteria to determine whether something is real or not. While claiming that he was by no means a “computer scientist,” he sought help while writing the law. He told The Guardian “... I was kind of struggling with the terminology. So I thought to myself, well, let me just ask the subject matter expert. And so I asked ChatGPT to write a definition of what was a deepfake.” When this was revealed, he came under criticism. Via X, he responded “Why work harder when you can work smarter.” He did not use punctuation, so I am not sure if it was a statement or a question. There is so much here to be furious about.

Sitting at my home office in Ypsilanti, Michigan, I wondered how hard it might be to find someone who might actually be a subject matter expert on the topic of artificial intelligence. Maybe it is my mad-librarian skills, but it took less than 30 seconds to find 18 experts in the field of artificial intelligence from Arizona State University, all of whom might actually be interested, willing, and able to provide guidance to a state representative.2 Or maybe he could have reached out to the schools that he attended (according to his site): Georgetown as an undergraduate and the University of Pennsylvania for law school. Maybe since they clearly missed educating him, they could do an alumnus “a solid.” Or maybe he could have asked someone at a local coffee shop.

Central to this entire article and action is the issue of authority and expertise. Who can provide you with a simple definition when you are struggling with writing a law? Should it be a so-called expert who might struggle with the definition because of the legal ramifications of the wording that makes it into state law? Or should it be a machine that spits out an answer in less time, and often at less cost, than you could get otherwise?

There are many questions that have definitive answers. What year did the Elvis film G.I. Blues come out? (1960). Who led the National League in ERA in 1978? (Craig Swan of the New York Mets). Google and other search engines have long used AI to help us answer these questions. When our questions are less definite and have more nuances, relying on a “black-box” solution indicates that someone has no idea what they are asking. I suspect that many in government and higher education are taking the same shortcuts that Alexander Kolodin took. We begin to realize that having an answer vs. having an answer that is well thought out and makes sense are two very different things.

While artificial intelligence has been with us for some time (think about spellcheck, GPS, and predictive text on your phone, among others), the launch of ChatGPT in 2022 ushered in a whole new world where you can seemingly do more with less. And that is why (I believe) this will take off. The notion of being able to produce content, images, and laws with relatively little effort is super attractive. ESPN has launched daily game previews that are written (or more accurately compiled) by AI. Almost all the previews have this statement on the bottom: “The Associated Press created this story using technology provided by Data Skrive and data from Sportradar.” These previews are factually accurate, but are about as useful as month-old milk. Still, they generate a preview page without needing people to do the actual work. That means more clicks on the ESPN site without having to pay writers to generate content. I fear it will only get worse in the years to come.

While this is a grim way to start the column, it is all the more reason to applaud the people who undertake the work of understanding what they are reviewing. By carefully reviewing these works, librarians get a sense of how these reference and monographic works will fit into our collections. Our desire in this column is to ensure that we spend time with these works so librarians everywhere can make good decisions about where to place their valued financial resources. That is not something that can easily be done by sending a question to a GenAI service. That takes time and understanding.

An interesting mix of books was reviewed for this column, including two reference books and two monographs about our field. If there is a common theme here, it is likely that they were the reviews that were completed. The hope, and the goal, continues to be to catch up with the column. Like world peace, that appears to be aspirational, but a good goal nonetheless.

Fred and Ginger are now one year old. I figured that I needed a picture of our kittens if I am going to go off on AI.

Thanks to my reviewers who submitted reviews for this issue. They are: Carolyn Filippelli (University of Arkansas-Fort Smith), Priscilla Finley (University of Nevada, Las Vegas), Alexandra Godfrey (Librarian of the Senate of Pennsylvania) and Jennifer Matthews (Rowan University). As always, I want to thank them for bringing this column together.

If you would like to be a reviewer for Against the Grain, please write me at <>. If you are a publisher and have a book you would like to see reviewed in a future column, please also write me directly. You can also find out more about the Reader’s Roundup here — atg-readers-roundup.

Happy reading and be nutty! — Corey

Environmental Resource Handbook: A Comprehensive Source Book for Individuals and Professionals. Grey House Publishing, 12th edition, 2023/2024. 978-1-63700-551-4. 1168 pages. $155. Also available online through Grey House Online Databases.

Reviewed by Carolyn Filippelli (Reference Librarian, Boreham Library, University of Arkansas-Fort Smith) <>

The Environmental Resource Handbook is an authoritative, up-to-date resource for environmental information. With its inclusion of wide-ranging statistics, reports from government agencies and special subject sections, it is an essential reference. Organized for convenience of use, Section One includes the U.S. Energy Information Administration’s Annual Energy Outlook and the EPA’s Strategic Plan. In addition to these documents, categories such as Associations and Organizations, Conference & Trade Shows, Consultants, Environmental Law, Research Centers, and Green Products & Services provide an impressive collection of sources.

Guide to the ATG Reviewer Ratings

The ATG Reviewer Rating is included for each book reviewed. Corey came up with this rating to reflect our collaborative collection and resource-sharing environment, and he thinks it will help classify the importance of these books.

• I need this book on my nightstand. (This book is so good, that I want a copy close at hand when I am in bed.)

• I need this on my desk. (This book is so valuable, that I want my own copy at my desk that I will share with no one.)

• I need this in my library. (I want to be able to get up from my desk and grab this book off the shelf, if it’s not checked out.)

• I need this available somewhere in my shared network. (I probably do not need this book, but it would be nice to get it within three to five days via my network catalog.)

• I’ll use my money elsewhere. (Just not sure this is a useful book for my library or my network.)

Section Two includes statistics, tables, and charts on topics such as air and drinking water quality, municipal waste, children’s environmental health, green metro areas, and toxic environmental exposures. Since much of the statistical data provided begins with the 1950s, it may be useful for historical comparisons. The EPA Report on the Environment, information from the United States Department of Agriculture on food crop production, pesticides, and habitats, and the EPA’s Wasted Food Report are eye-opening. Also very informative are the U.S. Census Bureau’s Environmental Revenue and Expenditures by State and the U.S. Centers for Disease Control’s National Report on Human Exposure to Environmental Chemicals. Although many of the government statistics and reports included in this work are also available on the web, their organization in this one volume is a great convenience and a timesaver for those conducting research on these topics.

The thought put into the organization of this work is evident in the three easy-to-use indexes: an Entry Index, a Geographic Index, and a Subject Index. In addition, there is a guide to Acronyms & Abbreviations and a Glossary of Environmental Terms. This is the 12th edition of this title, a work that has proven its merit as an important source for environmental information. For a handy and well-organized resource on the environment, look no further. Current interest in the environment will make this timely volume a popular and frequently consulted item. It would be of use in public and academic libraries as well as in state agencies and other organizations focused on environmental matters.

ATG Reviewer Rating: I need this in my library. (I want to be able to get up from my desk and grab this book off the shelf, if it’s not checked out.)

Kostelecky, Sara R., Lori Townsend, & David A. Hurley (Eds.). Hopeful Visions, Practical Actions: Cultural Humility in Library Work. Chicago: ALA Editions, 2023. 9780838938300, $54.99.

Reviewed by Jennifer Matthews (Head, Collection Services, Rowan University) <>

Today’s library is often seeking to better understand its patrons. The actions taken by librarians range broadly, encompassing everything from scheduling diverse programming to determining ways to remove racist bias from old cataloging records. This process can be further strengthened when library staff at all levels participate in ventures that challenge long-held beliefs and practices. One such practice is cultural humility which, as defined by Kostelecky, Townsend, and Hurley, is “the ability to maintain an interpersonal stance that is other-oriented in relation to aspects of cultural identity that are most important to the other person, the ability to recognize the context in which interactions occur, and a commitment to redress power imbalances and other structural issues to benefit all parties.” (Hurley, Kostelecky, & Townsend, 2019,3 quoted p. xv).

In their latest work, the editors provide a framework for cultural humility that gives scope to the authors of the included chapters. This framework reminds readers that cultural humility comes from self-reflection and ongoing growth, remembering the person whose life experience you are trying to improve, and addressing power differentials rather than ignoring them. To best represent this, the book has been divided into four parts: origins, reflective practice, community, and hopeful visions. These sections feature library practitioners working with and experiencing cultural humility in different ways while trying to incorporate it into their daily practice.

For instance, Mark Emmons discusses the intersection of cultural humility and servant leadership in his chapter. For those unfamiliar with servant leadership, the term was coined by Robert Greenleaf: “the servant-leader is a servant first…. It begins with the natural feeling that one wants to serve, to serve first.” (Greenleaf, 1991,4 quoted p. 155). Emmons moves from this definition to addressing issues of humility and performance, power and authority, and cultural identity, which leads to how servant leadership and cultural humility can work closely together to make better leaders out of library staff.

In another chapter, Nicholae Cline and Jorge R. López-McKnight challenge readers to consider cultural humility not as a definition but as a theory of change. Cultural humility is a complex and evolving topic that varies for each individual who considers and interacts with it. If librarianship were to consider cultural humility in such a fashion, it would broaden the scope of cultural humility from the field of librarianship to other fields and areas. Perhaps, as the chapter’s authors suggest, this is precisely the type of action this topic necessitates.

Kostelecky, Townsend, and Hurley have curated a thoughtful selection of writings (part chapters and part essays) that make the reader consider aspects of cultural humility — both thoughtful, provoking ideas and practices that are not yet standard. Cultural humility is yet another framework to assist libraries in continuing to improve their efforts in the equity, diversity, and inclusion space.

ATG Reviewer Rating: I need this on my desk. (This book is so valuable, that I want my own copy at my desk that I will share with no one.)

Pedley, Paul. Essential Law for Information Professionals. 4th ed. London: Facet Publishing, 2019. 9781783304356, 349 pages. $70.19.

Reviewed by Alexandra Godfrey (Librarian of the Senate of Pennsylvania, Senate of Pennsylvania) <>

Whether we’re focused on copyright restrictions for faculty in a university or government Right to Know requests, most librarians understand that their work as information professionals intersects somewhere with the law. While we aren’t required to be lawyers, nor even permitted to dole out legal advice, we do understand more or less what we are required to do ethically and professionally, and where we have to draw the line. Post-2000, the information landscape gets even muddier with Freedom of Information Acts and, in the U.S., the Patriot Act. Essential Law, 4th edition, provides an update to a wide-reaching primer on law in general, with specific sections for the information professional and the librarian. Since Facet Publishing is an entity of CILIP (Chartered Institute of Library and Information Professionals), a library and information association in the United Kingdom, the reference book is largely and almost uniformly concerned with UK law. That makes the text even more sprawling, since it focuses on UK law at large, with the varying and specific carve-outs for the different countries within the UK.

Author Paul Pedley has written several books on law and information access, including works on copyright and access and on patron and library privacy, and has held such prestigious appointments as Head of Research at the Economist Intelligence Unit, library manager at a UK law firm, and professional positions in government libraries. While Pedley himself points out that information law is incredibly fast moving, readers should be aware that the book was published in 2019, and that the five-year gap since publication could mean core discrepancies between its contents and the law today.

This reference source could easily be used as a textbook, with the first three chapters — on general law and background, library law, and copyright — being especially salient to an information science student. These three chapters give the law of the land, legally, and provide the reader with a sense of UK law in general. Basics like common law and civil law, the court system, and UK sources of law are all covered and should give the student the background information they may remember from secondary school or university. Library law in the UK is very specific, and the reader and librarian should certainly be aware of the main laws and acts that rule library land, like those that govern local entities and legal deposit. The almighty copyright rules also get their fair share of content, with basics rooted in international concepts and decisions like the Berne Convention, the Universal Copyright Convention (UCC), and the TRIPS Agreement.

The following thirteen chapters would be best used as a reference. Covering everything from data and patron privacy to licensing and cybersecurity, the average librarian and information professional just cannot account for all the legalities and specializations covered. Better to be aware of their existence in the text and reference as needed. Moreover, the reader should be aware that this is UK-specific. Copyright in particular is a beast that varies hugely from country to country, and it may just not be a good use of time to master the UK’s version of intellectual property protection if it does not apply in your workplace. For those law librarians in the UK and others who work directly with legal professionals and issues, this book would be useful as a primer and reference material. UK law alone, as it applies to and contrasts with law in countries worldwide, is certainly interesting, but only to the legal enthusiast. For any of us outside of the UK, it just isn’t worth the dense investment in source material since so much won’t apply. Even personally, as a government law librarian who found the information curious and interesting, I just wouldn’t have a basis of use for it in the United States.

UK Audience | ATG Reviewer Rating: I need this in my library. (I want to be able to get up from my desk and grab this book off the shelf, if it’s not checked out.)

For Librarians in other countries | ATG Reviewer Rating: I’ll use my money elsewhere. (Just not sure this is a useful book for my library or my network.)

Sawtelle, Jennifer, editor. Magill’s Literary Annual 2023: Essay-Reviews of 150 Outstanding Books Published in the United States During 2022. Ipswich, MA: Salem Press, Inc., 2023. ISBN 9781637004753, 780 pages. $210 print, $210 eBook; institutional access to an online archive of the 1977-2022 annual volumes is included with purchase.

Reviewed by Priscilla Finley (Humanities Librarian, University Libraries, University of Nevada, Las Vegas) <>

Among the readers’ advisory tools on the market today, Magill’s Literary Annual fills an important niche. Instead of aiming to be comprehensive, the goal of the series, as described in the publisher’s note, is to “cover works that are likely to be of interest to general readers, that reflect publishing trends, that add to the careers of authors being taught and researched in literature programs, and that will stand the test of time” (vii). In a climate where partisan-driven book challenges are increasing, a collection like this does important work in helping librarians communicate the context, value, and relevance of these books to a wider community of readers, teachers, and scholars.

In contrast with capsule reviews available in readers’ advisory databases like NoveList and Gale’s Books and Authors, the four-page essay-reviews in Magill’s Literary Annual build the case for the significance of each selected work by assessing the author’s background and goals and the point of view, focus, and methods of the work under discussion. Fiction, poetry, nonfiction, memoir, biography, graphic novels, and more are represented, for both adult and YA audiences; care has been taken to select works by members of groups who have historically been underrepresented, including works that have been translated into English. It’s worth noting that this series draws mainly from the catalogs of commercial trade publishers, with only a few selections from small presses or university presses.

Practical features of each entry make it valuable for the day-to-day work of librarians. A block of reference information at the beginning of each entry includes bibliographic details, type of work, and information about the work’s geographical setting and era when relevant; a capsule summary is included, along with publisher-supplied cover and author photos and a brief author biography including awards. This is all essential material for librarians promoting a work on the shelf or in a display. A list of principal characters or persons discussed communicates elements of each work’s scope and tone as well as being a nice enhancement to improve keyword searches in the online archive.

The thirty reviewers include teachers, literature professors, librarians, scholars, writers and readers from a broad range of academic backgrounds and communities. Their essay-reviews incorporate detailed summaries and assess strengths and challenges that target audiences may encounter when reading the work. They conclude with an analysis of critical assessments in reviews and a bibliography of reviews drawn from newspapers, magazines and web publications.

For example, Emily Turner’s essay-review of Jonathan Escoffery’s short story collection If I Survive You presents details of the ascent of Escoffery’s career and offers commentary on the content and reading experience of interlocked short stories — “not a passive activity but one that requires readers’ attention and willingness to be challenged” (260). She identifies a cohort of other writers exploring Caribbean immigrant experiences and pulls out common themes, and she cites and addresses both compliments and critiques issued in reviews of the work.

Librarians and teachers are likely to tolerate, if not appreciate, the print-centric format of this work. Care has been taken to make the eBook version reader-friendly — for example, an annotated table of contents includes a capsule review of each work and hyperlinks to the essay-review. Consistent formatting of the entries makes the work easy to skim for a particular facet of interest or keyword search. However, if one does not know that this series exists, it is a real challenge to find the volumes in a library catalog or on a list of databases or LibGuides — and this is a real shame. Individual essay-reviews that I spot-checked were not indexed in Book Review Index Plus or other databases, and citations to them were not integrated into the central indexes of leading discovery services. Developing a strategy to expose item-level indexing of each essay-review to library catalogs and databases would be a key improvement to give the reviewers’ hard work the exposure it deserves.

Salem Press offers institutional purchasers access to 46 years’ worth of previous volumes in online format. Notions of canonicity have shifted considerably since 1977, but a yearbook that reflects commercial publishing trends and highlights works that demonstrate cultural significance through a record of promotion and critique will be an important primary source for future retrospective analysis.

In addition to supporting collection development and reader advisory work, librarians in the unenviable position of needing to defend the contemporary literary landscape and argue for the value of literary work to book review committees, school boards or political grandstanders will benefit from the measure of cultural weight that this work documents.

ATG Reviewer Rating: I need this in my library. (I want to be able to get up from my desk and grab this book off the shelf, if it’s not checked out.)


1. Growcoot, Matt. ChatGPT Used to Write Part of Arizona State Law on Deepfakes. PetaPixel, May 23, 2024. Accessed on May 28, 2024 at https://petapixel.com/2024/05/23/chatgpt-used-to-write-part-of-arizona-state-law-on-deepfakes/

2. Arizona State University Media Relations and Strategic Communications – Experts on Artificial Intelligence. Accessed on May 28, 2024 at expert-tags/artificial-intelligence

3. Hurley, D.A., S. R. Kostelecky, and L. Townsend. “Cultural Humility in Libraries.” Reference Services Review 47, no. 4 (2019): 544-55.

4. Greenleaf, R. K. Servant Leadership: A Journey into the Nature of Legitimate Power and Greatness. New York: Paulist Press, 1991.


Booklover — Call to Greatness

Editor: Donna Jacobs (Retired, Medical University of South Carolina, Charleston, SC 29425) <>

In a tumultuous time, one often turns to the written word for solace, inspiration, or guidance. For many, poetry is the perfect medium for this. Many of the Nobel Laureates are poets and their works give us emotional food, many times when we are starving and don’t even realize it.

Carl Gustaf Verner von Heidenstam was such a poet. One lovingly beautiful description of his work by the Swedish critic, Sven Söderman, states: “In the constellation of original artists who regenerated Swedish poetry at the end of the last century, Verner von Heidenstam was the most brilliant star.”

And with such an accolade, maybe it is no big surprise that Carl Gustaf Verner von Heidenstam was awarded the 1916 Nobel prize “in recognition of his significance as the leading representative of a new era in our literature.”

In today’s world, it is less and less of a challenge to discover works, especially poetry, from these early Nobel Laureates. With websites like, one can often get lucky, find something to enjoy, and maybe get a peppering of analysis.

“Åkallan Och Löfte” is one of four poems offered on the site and the title translates to “Invocation and Promise.” (https:// The poem captivates immediately. And yes, recognizing that it is presented in translation, one can only imagine the power delivered in his native tongue.

Let this poem from the early twentieth century bring its power to you. We could easily imagine that it was written for us today. The analysis is included as well, for it underscores the timeliness of these words.

“Invocation and Promise”

And shouted three neighbors: Forget the greatness you have laid in the earth!

I answered: Arise, our dream of greatness about domination in the Nordics! That dream of greatness still shines on us to play in new exploits.

Let up our graves, nay, give us men in research, in colors and writings.

Yes, give us a people on a precipitous brink, where a fool can break his neck.

My people, there are other things to carry in hand than a bread-filled Egyptian stew. It’s better, that pot is spilling, than the living heart rusts; and no people shall be more than you, that’s the goal, whatever the cost.

It better be reached by an avenger but to nothing see the years go by, it is better that all our people perish and farms and cities burn.

It is prouder to dare his roll of the dice, than thin with extinguishing flame.

It’s nicer to listen to a broken string, than never to draw a bow.

I wake up at night, but around me is peace. Only the waters storm and boil. I could throw myself down in longing as a praying warrior of Judah.

I don’t want to beg for sunny years, about harvests of gold without end.

Merciful fate, kindle the lightning that strikes a people of years of misery!

Yea, drive us together with scourges, and bluest spring shall bud.

You smile, my people, but with stiff features, and sings, but without hope.

You’d rather dance in silk robes than your own riddle suggests. My people, you shall wake up to youthful deeds the night you can cry again.

Analysis (ai): Åkallan Och Löfte is a powerful and patriotic poem that expresses the speaker’s desire for greatness for his people. It urges them to break free from complacency and embrace challenges, fostering a spirit of progress and innovation.

Unlike many of the author’s more serene and nature-inspired works, this poem reflects the tumultuous times of its composition, amidst political and social upheaval. It echoes the themes of national pride and determination that characterized the period.

The poem’s relentless refrain, “Giv oss män,” emphasizes the need for strong and capable individuals to lead and inspire the people towards their destiny. The speaker’s exhortation to risk and sacrifice contrasts with the tranquility of the surrounding landscape, creating a sense of urgency and determination.

“You smile, my people, but with stiff moves, and sing, but without hope.” — Carl Gustaf Verner von Heidenstam

Overall, Åkallan Och Löfte is a compelling call to action, urging a nation to rise above its current state and strive for greatness through adversity and sacrifice.



Section Editors: Bruce Strauch (Retired, The Citadel) <> Jack Montgomery (Georgia Southern University) <>

Legally Speaking — Après Moi Le AI Déluge

Column Editor: Abby L. Deese (Assistant Library Director for Reference and Outreach, University of Miami School of Law) <>

In 2020, the World Economic Forum speculated that 85 million jobs would be lost to AI automation.1 For years, our focus as information workers, as publishers, and as service providers has been on the impact of artificial intelligence on the workforce and the promise of efficiencies. However, we have been hearing for decades that new technologies will steal our jobs, even as we adapt and grow our responsibilities to suit new workflows. I have never been one for job-loss catastrophizing due to advancing technology — especially when the proven threat to the workforce is the careless grip of private equity squeezing every last drop of cash from any profitable business.2

A more tangible threat from generative AI has begun to reveal itself as the technology continues to develop: a threat to the integrity of information. The phenomenon of large language models (LLMs) generating less accurate text as they ingest AI-generated materials in newer iterations has already been appropriately dubbed “cannibalism,”3 and early studies have shown that models lacking original data contributed by human creators will swiftly collapse.4 If the information appearing in AI-generated materials is destined for a downward spiral, where does that leave our information ecosystem?

Paper mills have been flooding journals with submissions for years, but some publishers have pointed out that generative AI has “handed them a winning lottery ticket” by enabling them to rapidly increase the volume of submissions at little cost.5 Some publishers have adopted tools to detect AI submissions, but their reliability is inadequate to fight the volume.6 In fact, some of the tools rely upon the same generative AI they are designed to detect, doubtless creating a feedback loop! The mounting costs of this deluge of fake scholarship, both financial and reputational, have caused Wiley to close nineteen journals in the last year.7 And it’s not only academic publishers who have been affected — the literary journal Clarkesworld had to close its open submissions for the first time in 2023 due to a flood of AI submissions.8

I was curious to see how prevalent obviously AI-generated submissions were after seeing several mentions on social media of ChatGPT signal language being identified in published scholarly materials. A cursory search of Google Scholar for the phrase “as of my last knowledge update” yielded 145 results.9 While several might still be the result of a disclosed usage, it was clear from the preview text that many were the result of careless editing. Users of ChatGPT had neglected to remove the key phrases produced by the generative AI before inserting the text into their articles. While you might expect only known or suspected paper mills to appear in the results, they included journals published by Springer and Elsevier, where ChatGPT seems to be seeing growing usage for summarizing literature reviews. Whether or not a generative AI tool is capable of authoring an original and well-reasoned law review article, as some claim,10 this poses a threat to the quality of intellectual inquiry.
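The phrase search described above can be approximated in a few lines of code. The following is a minimal, hypothetical sketch (the phrase list and the function name are my own illustration, not a tool mentioned in this column) of flagging tell-tale ChatGPT boilerplate left behind by careless editing:

```python
# Hypothetical list of boilerplate phrases that often betray
# unedited ChatGPT output pasted into a manuscript.
SIGNAL_PHRASES = [
    "as of my last knowledge update",
    "as an ai language model",
    "i cannot browse the internet",
    "certainly! here is",
]

def find_ai_signals(text: str) -> list[str]:
    """Return the signal phrases found in text (case-insensitive)."""
    lowered = text.lower()
    return [phrase for phrase in SIGNAL_PHRASES if phrase in lowered]

abstract = ("As of my last knowledge update, the literature on "
            "deepfake regulation remains sparse.")
print(find_ai_signals(abstract))  # ['as of my last knowledge update']
```

A real screening workflow would, of course, pair a crude string match like this with human review, since disclosed or quoted uses of these phrases are legitimate.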

As AI-generated books begin flooding the Amazon marketplace,11 there are also reports that Google Books is indexing AI-generated texts.12 Generative AI is creeping into every aspect of the information environment, often without our input. Google launched its Gemini AI search overviews in May, and while Google’s new head of Search claims that the tool leans towards “factuality,”13 that doesn’t change the fact that many of the generated overviews are sourced from the Q&A forum Quora, which is notoriously full of junk questions and answers.14 How exactly is Google’s AI search assessing this factuality? As with most generative AI products, including the search algorithms that preceded the current environment, the process is an impenetrable black box of proprietary information. In fact, it’s hard to believe that a generative AI is capable of accuracy assessment at all when it is only a more advanced text predictor trained on a larger corpus.15

Generative AI is good at generating content that looks like the content it was trained on — this doesn’t mean it’s capable of “factuality,” and if it continues to ingest the products of its own process, we can only expect further degradation of quality and accuracy. Researchers rely on access to reliable and quality information, and librarians spend much of our time trying to facilitate that access. If information becomes glutted with cannibalized AI gunk, then what are we left to work with? Rather than reducing jobs or increasing efficiency, to me it sounds like generative AI will only create more and harder work for everyone involved in information work! So what’s to be done?

It’s unlikely that we’ll be able to put the sardines back in the tin at this point, but I’m not ready to shrug with the French nihilism implied by the title of this piece either. Regulation of generative AI is already a hot topic, and while it’s certainly no Asimov’s Three Laws, the Biden administration’s Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence16 is at least proof that governments are thinking about the risks and the promise of new technology.

But despite increasing efforts to create national borders on the Internet, those cables run all over the world, and nothing less than a global solution will suffice. Groups like Pause AI want to slow development of AI technologies until their risks and threats are properly assessed and mitigated.17 However, others argue that a slowdown is insufficient and that AI development must be halted entirely.18 While protestors have begun demonstrating across the world ahead of a global summit on AI in Seoul, many of their concerns seem rooted in the fear of catastrophe and the specter of Artificial General Intelligence,19 which ChatGPT and its ilk patently are not.

What solutions do you see in the future of AI? Will slow librarianship help us approach the curation of information with more deliberation to combat the dangers of the generative flood? Will regulation actually succeed in curbing the worst excesses of exploitative technology?

The Against the Grain team would like to express our sincere gratitude to Abby for her excellent writing and research on articles that keep us all up to date on the latest issues facing libraries in the Legally Speaking column. We wish her all the best! As she says above, please contact us at <> if you’re interested in writing for this column or have someone to suggest that might be a good fit.

I asked ChatGPT for its input on the best way to regulate itself and these are some of the suggestions it helpfully provided from the depths of its corpus.20

Regulating generative AI to prevent potential negative consequences is indeed crucial. Here are some key approaches:

1. Transparency and Accountability

2. Ethical Guidelines and Standards

3. Regulatory Frameworks

4. Education and Awareness

5. Risk Assessment and Mitigation

6. International Collaboration

7. Continuous Monitoring and Evaluation

8. Public Engagement and Participation

By combining these approaches, policymakers can work towards a regulatory framework that promotes the responsible development and use of generative AI while mitigating potential risks to society.

Maybe there’s hope for us yet.

Column Editor’s Note: This is my final column as editor for Legally Speaking. If you are interested in writing for Against the Grain, you can pitch the editors at

endnotes on page 42



1. Bernard Marr, “Hype Or Reality: Will AI Really Take Over Your Job?,” Forbes, May 15, 2024, bernardmarr/2024/05/15/hype-or-reality-will-ai-really-take-over-your-job/.

2. Brendan Ballou, “Private Equity Is Gutting America—and Getting Away With It,” Opinion, The New York Times, April 28, 2023,

3. Amy Cyphert et al., “Artificial Intelligence Cannibalism and the Law,” Colorado Technology Law Journal (forthcoming), available at

4. Id.

5. Nidhi Subbaraman, “Flood of Fake Science Forces Multiple Journal Closures,” The Wall Street Journal, May 14, 2024, https:// ink=desktopwebshare_permalink

6. Clive Cookson, “Detection Tool Developed to Fight Flood of Fake Academic Papers,” Financial Times, April 13, 2023, https://

7. Subbaraman, “Flood of Fake Science.”

8. Alex Hern, “Sci-fi Publisher Clarkesworld Halts Pitches Amid Deluge of AI-generated Stories,” The Guardian, February 21, 2023,

9. In addition to the phrase, I excluded mentions of ChatGPT from results to reduce false positives related to scholarship about generative AI. These results were retrieved on 5/7/2024.

10. Sarah Gotschall, “Move Over Law Professors? AI Likes to Write Law Review Articles Too!,” AI Law Librarians, March 28, 2024,

11. Andrew Limbong, “Authors Push Back On the Growing Number of AI ‘Scam’ Books on Amazon,” NPR, March 13, 2024, https://

12. Emanuel Maiberg, “Google Books Is Indexing AI-Generated Garbage,” 404 Media, April 4, 2024, These were located using a similar method to my search of the Google Scholar index.

13. David Pierce, “Google Is Redesigning Its Search Engine – And It’s AI All The Way Down,” The Verge, May 14, 2024, https://www.

14. Jacob Stern, “If There Are No Stupid Questions, Then How Do You Explain Quora?,” The Atlantic, January 9, 2024, https://www.

15. Adam Zewe, “Explained: Generative AI,” MIT News, November 9, 2023,

16. Joseph R. Biden, Executive Order 14110, “Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.” Federal Register 88, no. 210 (November 1, 2023): 75191,

17. Anna Gordon, “Why Protestors Around the World Are Demanding a Pause on AI Development,” TIME, May 13, 2024, https://

18. Eliezer Yudkowsky, “Pausing AI Developments Isn’t Enough. We Need to Shut it All Down,” TIME, March 29, 2023, https://time. com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/

19. Gordon, “Why Protestors Around the World Are Demanding a Pause.”

20. ChatGPT 3.5, in answer to the prompt: “What’s the best way to regulate generative AI to prevent the downfall of society?”


Questions & Answers — Copyright Column

QUESTION FROM AN ARCHIVIST: Some remarkable collections include personal letters and correspondence from various individuals. The donor of a particular collection was the recipient of the letters, but not the author; the letters were written by a third person who mailed them to the donor. From a copyright perspective, who is the “owner”? And were any rights transferred when the donor gave the collection of personal correspondence to the archives?

ANSWER: This is a very common question, and one that comes up frequently with donated collections that include letters and other personal correspondence, which are highly useful to patrons including scholars, researchers, biographers, and historians. For an archivist managing collections that include personal letters and correspondence, understanding copyright ownership and the implications of donation is crucial. This is particularly important when the letters were authored by one person, sent to another, and then donated by the recipient to the archives. To address these concerns, we must delve into the nuances of copyright law regarding authorship, ownership, duration, and transfer of rights. The basic answer, however, is this: possession of a third party’s copyrighted work, like a letter, does not convey copyright ownership. While the saying goes that “possession is nine tenths (9/10) of the law,” that last one tenth (1/10) is often copyright, and it is something that should be assessed.

Under U.S. copyright law, the author of a work is generally considered the copyright owner. In the case of personal letters, the person who wrote the letters — the author — holds the copyright, not the recipient of the letters. This means that the author of each letter retains the exclusive rights to reproduce, distribute, perform, display, and create derivative works based on their original correspondence. So, for example, if Julie writes a letter to Jack, Julie is the copyright owner of that letter. Even though Jack possesses the physical letter, Julie retains the copyright to the contents of the letter.

It’s important to distinguish between ownership of the physical letter and ownership of the copyright. When the donor (Jack, in our example) received the letters, he became the owner of the physical objects (the paper letters). However, this physical ownership does not transfer the copyright of the letters. Jack can read, keep, or sell the letters, but he does not have the legal right to reproduce or publicly display their content without permission from the copyright holder, Julie.

For the copyright to be transferred to the archives, the original copyright owner (the author of the letters) must explicitly transfer those rights, typically in writing. Such a transfer could include all or some of the rights associated with the work. Without such an explicit transfer, the copyright remains with the author.

In the context of archival donations, if Jack (the recipient) donates his collection of letters to the archives, this donation does not automatically transfer the copyright to the archives. The archives receive the physical letters, but the copyright remains with Julie (the author) unless Julie has explicitly transferred her copyright to Jack or directly to the archives.

So, there are implications for archives in accepting collections with third-party personal correspondence. When receiving a collection of personal correspondence, archivists must be mindful of these copyright issues. The archives can legally hold and preserve the physical letters, but their ability to reproduce, publish, or display the contents is governed by statutory exceptions to copyright law, such as the “library and archives exception” under 17 USC 108 and fair use under 17 USC 107.

Yet, there is one more wrinkle in this question that is also worth mentioning. In many scenarios, it can be helpful to determine if the copyrighted work is still under copyright or in the public domain. In personal correspondence situations, this is defined by the work being considered “published” or “unpublished” under the copyright law.

In copyright law, the distinction between published and unpublished works is significant and has implications for the length of copyright protection, and certainly how the rights of these works are treated, including personal correspondence in archival collections.

A work is considered “published” when it has been distributed to the public by sale, transfer of ownership, rental, lease, or lending, or offered for distribution to a group of persons for purposes of further distribution, public performance, or public display. A common example of publication is making copies of the work available to the public at large.

An unpublished work is one that has not been distributed to the public. Private communications, such as personal letters, correspondence, and diaries, typically fall under this category. Unpublished works are often created and kept within a private context without the intention of public dissemination.

The distinction between published and unpublished works affects the duration of copyright protection. One of the best resources for finding the rules about the duration of copyright is Cornell University Library’s Copyright Services chart “Copyright Term and the Public Domain in the United States.” (Many of my colleagues refer to this as the “Hirtle Chart” because it was first made famous by Peter B. Hirtle, legendary archivist and copyright expert, in an article called “Recent Changes to The Copyright Law: Copyright Term Extension,” Archival Outlook, January/February 1999. As Peter always notes, his chart is based in part on Laura N. Gasaway’s chart, “When Works Pass into the Public Domain” which was hosted at University of North Carolina.) The chart is updated annually by Cornell University Library and continues to be a “go-to” resource for duration and public domain questions.

According to the chart, for unpublished works created after January 1, 1978, the copyright term is the life of the author plus 70 years. For works created but not published or registered before January 1, 1978, the copyright term is also the life of the author plus 70 years, but with a minimum term of protection through at least December 31, 2002; if such a work was published before that date, protection extends through December 31, 2047.

Works published before January 1, 1978, are subject to different rules based on whether the copyright was properly renewed, typically lasting for 95 years from the date of publication if the renewal term was secured. Works published after January 1, 1978, receive the same protection term as unpublished works created after this date: the life of the author plus 70 years.
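The duration rules just summarized can be expressed as a small decision procedure. The following is only an illustrative sketch of the simplified rules above (the function and parameter names are my own invention); real determinations involve many edge cases, such as renewal status, corporate authorship, and foreign works, for which the Hirtle chart remains the authority:

```python
def copyright_end_year(author_death_year, publication_year=None,
                       created_before_1978=False):
    """Approximate the last year of U.S. copyright protection for a
    personal letter, following the simplified rules summarized above."""
    life_plus_70 = author_death_year + 70
    if publication_year is None:
        if created_before_1978:
            # Unpublished, created before 1978: life + 70,
            # but protected at least through the end of 2002.
            return max(life_plus_70, 2002)
        # Unpublished, created 1978 or later: life + 70.
        return life_plus_70
    if publication_year < 1978:
        # Published before 1978 (assuming the renewal term was
        # secured): 95 years from publication.
        return publication_year + 95
    if created_before_1978 and publication_year <= 2002:
        # Created before 1978 but first published 1978-2002:
        # protected at least through 2047.
        return max(life_plus_70, 2047)
    # Published 1978 or later: same as post-1978 unpublished works.
    return life_plus_70

# An unpublished letter by an author who died in 1940,
# written before 1978: protected through 2010.
print(copyright_end_year(1940, created_before_1978=True))  # 2010
```

A quick sanity check: Julie’s never-published 1960 letter, if Julie died in 1990, would remain protected until 2060 under these rules, long after the physical letter reached the archives.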

All of these terms have implications for personal correspondence in archival collections: whether the letters are considered published or unpublished affects how they can be used and managed. Most personal letters and correspondence are considered unpublished works because they are typically shared privately between individuals rather than distributed to the public.

As unpublished works, these letters enjoy robust protection, potentially lasting for the author’s life plus 70 years, and they are often subject to more restrictive rules regarding reproduction and distribution.

So, when personal letters are donated to an archive, the distinction between published and unpublished is crucial. Since these letters are generally unpublished, the archive should respect the long duration of copyright protection associated with unpublished works and inform users as to their copyright status if asked. Even if the physical letters are donated, without a transfer of copyright, the archive is limited to preservation and private study uses, unless fair use exceptions apply (e.g., for scholarship, research, criticism, or commentary).

Here are a few considerations for personal correspondence in archival collections:

• Verify Copyright Duration/Expiration: If you can, determine whether the letters are published or unpublished using the Hirtle chart. As noted above, most personal letters will be classified as unpublished. If possible, identify the authors and ascertain the status of their copyrights, including the duration and any potential heirs.

• Consider Copyright Exceptions or Obtain Permissions: If the letters are unpublished and still under copyright protection, and the archivist’s uses fall outside the statutory exceptions (like archival reproduction, preservation, or fair use), an archivist might seek written permission from the authors or their estates to use the content beyond preservation and private study. As always, it helps to ensure any copyright transfers are documented in writing, clearly specifying the rights that were transferred to the archive.

• Educating Donors: Informing donors about the distinction between physical and copyright ownership is key. There is a great deal of misunderstanding about copyright and possession. Exploring whether donors can help secure copyright transfers from the original authors, so the archive can use the content more freely, is a useful way to provide education and understanding.

• Adhere to Copyright Law: As always, archivists should respect the limitations imposed by copyright law on unpublished works. However, they should also clearly understand that access for private research, study, or preservation is well within the rights of the archives. So, an archivist should utilize fair use or library and archives exceptions judiciously, understanding the specific conditions under which they apply. Patrons may make uses of the works, but archives are merely providing access to these works for personal study or research.

In closing, this question has multiple points in which an archivist can explore many facets of copyright and personal correspondence. When a collection of letters is donated to an archive, only the physical letters are transferred unless there is an explicit copyright transfer. Archivists should ensure they understand the copyright status of donated materials and document all agreements while making the collections accessible within legal boundaries. This careful management ensures compliance with copyright law and respects the intellectual property rights of the letter writers. Finally, the distinction between published and unpublished works in copyright law significantly impacts how personal correspondence is managed within archival collections. Unpublished works, such as personal letters, enjoy extended protection, necessitating careful handling to ensure legal compliance. By understanding these nuances, archivists can effectively navigate copyright issues, ensuring both the preservation of valuable historical documents and understanding the role of copyright in an archive’s access-based mission.


Seeing the Whole Board — Academia, Libraries, and the Opening Variation in the Societal Game of Chess

I’m pleased to introduce a new column to Against the Grain focused on the social and public policy issues that are affecting academic libraries. Through my close collaboration with EveryLibrary, a political action committee dedicated to safeguarding the future of libraries nationwide, I have seen how advocacy can influence policy and impact the societal trends affecting our collective professional future. I hope this column will bring sector-wide attention to the significant challenges facing libraries and academia. The goal of the column is to bring professionals inside and outside librarianship to Against the Grain to help academic librarians see the whole board and determine how policies and technologies, among other emerging issues, will impact their institutions, stakeholders, and profession as the chess games of society move around them.

EveryLibrary is a 501(c)4 organization and political action committee focused on influencing the funding and policy ecosystem, fighting book bans, crafting pro-library legislation, and supporting grassroots groups nationwide. The EveryLibrary Institute is a 501(c)3 organization that researches the public perception of libraries, provides advocacy training, and hosts policy sessions.

For many years, the EveryLibrary teams have liaised with state library associations that are made up of academic, school, and public librarians. They have been cited as experts by the library and publishing trade press and by local and national press, and they have worked with and been supported by publishers and vendors who know and care about libraries. The EveryLibrary teams have also formed working relationships and coalitions with organizations outside the library space who share similar concerns over policy, ideology, and academic discourse and similar enthusiasm about technology, shared alliances, and collaborations. The EveryLibrary team is made up of both library professionals and non-library subject matter experts. They have a broad view of changing policy and the chess games affecting libraries of all types. To continue the metaphor, they see all the pieces in play and the possibilities driving the changes that will impact not just libraries but education, research, technology, and society as a whole — the internal and external factors likely to shape libraries and the content and technology industry.

As for me, I have previously edited a column for Against the Grain about the future of libraries, so this is a variation on a theme. I am a non-librarian and a communications professional with 18+ years of experience working with library professionals, including school and public librarians; medical, corporate, and special librarians; and academic librarians. During my career, my work has shifted from the routine work of a communications professional to a more eye-opening role advocating and lobbying against the wave of legislation impacting libraries. I have spoken about my advocacy work through my professional work, as a board member at EveryLibrary — now as a Senior Policy Fellow at the EveryLibrary Institute, and as a board member at the American Library Association’s United for Libraries Division. Since 2017, that work has transitioned from rebutting claims of pornography in school databases and attacks on database providers (including litigation, legislation, hearings, and direct lobbying) to opposing book ban legislation and unconstitutional efforts at censorship. My day-to-day work life now includes confronting attacks on librarianship, opposing threats to criminalize the work of librarians, and analyzing current policies and gamesmanship — such as efforts to decertify the profession’s credentialing agency.

By 2022, many of us started to see new hazards coming for academia. The question was how they would manifest themselves. In a meeting at ALA in 2023 with academic library and LIS school deans, state library leaders, consortia professionals, and others, including EveryLibrary, PEN America, and ALA, it seemed clear that in academia, we would see a more asymmetrical effort rather than the library-specific policies we had seen in schools and public libraries. By 2023, we knew state agencies had been auditing courses at state institutions and calling for changes, moving from the humanities to STEM. DEI programs were just beginning to be targeted, using tactics seen in the CRT battles a few years earlier. Bills were being written about statewide contracts that demonstrated a shocking lack of awareness of resource pricing and contract scope. Seeing the whole board, I also wondered whether anti-ESG (Environmental, Social, and Governance) legislation, which in 2023 was limited to state-funded pensions and investments, would expand and impact state contracts and vendors. By early 2024, library vendors were indeed dealing with contract restrictions on their sustainability practices in certain states. These were the same practices that vendors were required to document when submitting tenders for business in the EU and elsewhere.

As you may be gathering, seeing the whole board is, in fact, a downer. However, my nearly two decades of experience with libraries have shown me that library professionals are highly credentialed individuals who toil under stereotypes that do not represent their capabilities and commitment. The complexities of the profession are astounding. Academic library professionals are on the leading edge of technology and have been for years. Data analytics, APIs, and AI are just some examples of now common or emerging library technologies that are just beginning to have an impact in areas outside of libraries.

Librarians’ ability to pivot and expand their horizons to match the challenges they face is part of a remarkable ability to build on knowledge. For example, privacy, a hallmark of libraries, has, over the years, expanded into data privacy. That corpus of knowledge now provides academic library professionals the opportunity to offer their privacy expertise to the larger institution as it grapples with student privacy concerns as well as physical (e.g., swiping into a building) and online data tracking (e.g., the learning management system). Are librarians in those meetings? Have academic stakeholders made the connection between libraries, privacy, and technology?

Content, another hallmark of libraries, is often spoken about in terms of cost, with open access and open research having dominated much of the discussion for the better part of a decade. No doubt, the rising costs and stagnant budgets also influence the discussion as a longtime concern. When I look at content, I see the value, not just the value to the library, students, researchers, and faculty, but the value to the institution. I’ve contended in conference sessions and when writing that if library professionals talk to their department heads, presidents, and provosts about full-text journals, databases, and eBooks, they are unlikely to gain the coveted “seat at the table.” Instead, I’ve suggested they abandon the library language we all have become accustomed to and speak instead about connecting those stakeholders to the latest competitive intelligence to help them benchmark themselves against other institutions, researchers, and grant seekers. They are missing an opportunity if they are not connecting those stakeholders to the latest information on their core areas of interest, mission statements, policies, and strategies through their library resources. It is not only an opportunity to showcase the value of library content; it is an iteration, a reminder, of the value of the library appearing daily on the phones and laptops of those stakeholders. That value, once established, might be a critical factor in a conversation about what more the library could accomplish with additional funding.

What about larger issues, such as social and policy issues? Consider the long put-off reauthorization of the Higher Education Act. Having seen the damage done to school library professionals by No Child Left Behind more than 20 years ago, at EveryLibrary, we are thinking ahead to how to galvanize experts to ensure that libraries are represented, not “left behind,” during the reauthorization debate. Dual-enrolled students have pushed K-12 issues onto academic campuses. How should libraries respond? What alliances do we need to make to impact legislation that could affect access to information? Research into areas such as public health messaging, election integrity, and disinformation has faced harsh criticism, Congressional subpoenas, and Freedom of Information Act requests. What role does the library play in protecting researchers, students, and faculty and promoting academic freedom?

The year began with Congressional hearings that put presidents of revered institutions in the hot seat, leading to resignations and what seems like a rapid shift in the perception of academic institutions themselves. What can the library do to restore academia’s reputation and keep misperceptions from becoming accepted truths?

These are the atmospherics within which academic library professionals navigate along with their traditional roles amidst the changing policy, funding, and technology environment. Library professionals, as part of the academy, must be aware of and manage these issues while navigating ideological and leadership influences that are likely to impact state-funded institutions, library schools, and library professionals. This column will endeavor to see the whole board and bring library and non-library perspectives to the discussion about libraries, the academy, the industry, and where we go from here.

Kathleen McEvoy is a long-time communications executive with direct experience in crisis communications, media and public relations, and public affairs. She has lobbied and created strategies to address legislation in multiple U.S. states and has met directly with state executives and legislators to call out the unintended consequences of legislation that impacts digital privacy and data stewardship, as well as data security risks and personally identifiable information (PII). Kathleen has presented on crisis communications, social media, communications, and media training. She has written about emerging technology, the current political landscape, and the legislative and policy issues impacting academia, research, and intellectual freedoms.

Kathleen is a board member of EveryLibrary, the national political action committee for libraries, and is a senior policy fellow at the EveryLibrary Institute. Kathleen has also co-chaired a task force on intellectual freedoms as part of the American Library Association’s United for Libraries division, where she is an executive board member and serves on the Intellectual Freedom, Public Policy, and Advocacy Committee.


And They Were There — Reports of Meetings 2023 Charleston Conference

Column Editor: Sara F. Hess (Business and Entrepreneurship Librarian, 309 Paterno Library, Pennsylvania State University, University Park, PA 16802) <>

Column Editor’s Note: Thanks to the Charleston Conference attendees, both on-site and virtual, who agreed to write brief reports highlighting their 2023 Charleston Conference experience. In 2023, the conference moved to an asynchronous format: the in-person conference (November 6-10) was followed two weeks later by a virtual week (November 27-December 1) that included online-only sessions and presentations as well. Conference registrants had the opportunity to view recordings and slides (if available), to re-visit sessions they saw “live,” or to catch sessions they missed. Without a doubt, there were more Charleston Conference sessions than there were volunteer reporters for Against the Grain, so the coverage is just a snapshot. In 2023, reporters were invited either to provide general impressions on what caught their attention or to select individual sessions on which to report.

There are many ways to learn more about the 2023 conference. Please visit the Charleston Conference YouTube site for selected interviews and videos, and the conference site for links to conference information and blog reports written by Charleston Conference blogger Donald Hawkins. The 2023 Charleston Conference Proceedings will be published in 2024, in partnership with University of Michigan Press.

In this issue, we have the second installment of reports from the 2023 Charleston Conference. The first group, in the April 2024 issue, included “Top 5” lists and reflections from first-time attendees. This group includes Neapolitan sessions as well as Stopwatch and Innovation sessions. Again, we’d like to say thank you to all of our report contributors for giving us a peek into the conference sessions they attended. And be sure to check out the next installment of reports in the September 2024 issue. — SFH


From Metadata to User Interfaces: Collaborative Initiatives to Make Content and Systems Accessible

Reported by Ramune K. Kubilius (Northwestern University, Galter Health Sciences Library & Learning Center) <>

Presented by Nettie Lagace (NISO), Bill Kasdorf (Kasdorf & Associates), Elizabeth German (Princeton University Library), and Amy Thurlow (EBSCO) — Video recording available at

German provided “The Library Perspective,” covering three elements: library fulfillment, the accessibility lens, and collaborators.

Addressing “Accessibility at EBSCO,” Thurlow spoke from the vendor’s perspective, spotlighting various projects, initiatives, and requirements, the repercussions of noncompliance, groundbreaking developments, and the education of partners, and she summarized accessibility features in the new EBSCO interface. Lagace reviewed the role of NISO, the communities of practice, and standards development principles. Kasdorf drilled into “Accessibility Metadata” — the formerly “nice to have,” now “must have” — covering the many metadata schemes, some identical, some crosswalked, and accessibility properties, and led it all back to a forthcoming new NISO standard. There are many moving parts, and through this presentation one all the more appreciates that there are partners and collaborators working to ensure that it all comes together. The useful slide deck (in the schedule), particularly Kasdorf’s resource lists and links, is worth another look.

Decolonizing the Catalogue – Vendor/Library Collaborations to Increase Diversity and Inclusivity

Reported by Shannon Tennant (Coordinator of Library Collections, Elon University) <>

Presented by Elaina Norlin (ASERL), Erica Bruchko (Emory University), Louise Hemmings (AM), and Kasia Figiel (SAGE) — Video recording available at

The panel discussed re-centering marginalized voices in archival collections from vendor and library perspectives. Norlin noted that librarians are facing time constraints and burnout as well as political obstacles; libraries want to collaborate with vendors, but trust needs to be built. Bruchko described three projects at Emory: work with the Library of Congress’s African American Subject Funnel; new metadata and contextualization for a 19th-century photograph collection; and the transformation of a print bibliography of African American newspapers into a linked data project. The vendors then spoke about their primary source projects. Hemmings identified the lack of a framework as a challenge for collaborative work and echoed the need for trust-building, stressing that AM listened to feedback from customers and advisory boards when describing and contextualizing content. Figiel described how SAGE created new content to address gaps in its Research Methods platform. The vendors examine their own positionality; editorial choices must be made, but while considering questions of who owns data, who benefits from it, and how to ensure it is widely accessible to diverse groups. This tension was echoed by an audience question about building trust with libraries given vendors’ inherent commercial purpose.


Subscribe to Open: The Challenges and Successes of an Innovative Model for Open Access

Reported by Amy Lewontin (Collection Development Librarian, Northeastern University) <>

Moderated by Antonia Pop (VP Publishing Div., University of Toronto Press), and Presented by Andrea Lopez (Annual Reviews), Sandra Shaw (University of Toronto Press Journals), Wendy Queen (Johns Hopkins Univ. Press / Project MUSE), and Katherine Brooks (Columbia University Library) — Video recording available at

The Subscribe to Open (S2O) session held in Charleston this year was well planned and well executed by the presenters, and useful both for those new to the model and for those already involved in it. Andrea Lopez, from Annual Reviews, represented the publisher that began the model. She discussed how it started with the idea of finding a revenue stream that would be cost neutral for subscribers and a way to open the material affordably. S2O would directly benefit society by opening the journal content to a huge group of readers without charging authors APCs. Lopez also talked about something many of the other speakers discussed: a true sense of openness and community involvement around S2O. It was only fitting that the next speaker, Sandra Shaw, of the University of Toronto, talked about how UT went on to create a new topical journal at the press that followed the model, the Journal of City Climate Policy and Economy, with goals similar to Annual Reviews’: broader dissemination of research on climate change, low subscription rates, and no APCs. Wendy Queen, of Project MUSE, spoke next on the idea of an aggregator using S2O, highlighting that this will be a project with many challenges, as MUSE works with so many different university presses. Lastly, Katherine Brooks, of Columbia University, said that S2O would be a model worth watching.


Stopwatch Session 1 — Ramune K. Kubilius, Moderator

Reported by Ramune K. Kubilius (Northwestern University, Galter Health Sciences Library & Learning Center)


Presentations Included the following:

Open Access Viewpoints and Self-Archiving Decisions of Public Health Faculty — Presented by Kayla Del Biondo (Yale University), Holly Grossetta Nardini (Yale University), and Kate Nyhan (Yale University). Note: Kayla Del Biondo presented.

OA Strategy, Meet Operational Workflows: Let’s get things done — Presented by Sarah Beaubien (University of Guelph), and Paul St-Pierre (University of Guelph).

Artificial Intelligence for the Information Profession — Presented by Antje Mays (University of Kentucky Libraries).

A Library-Press Publishing Model for the Humanities Monograph? Cornell’s Signale Program — Presented by Kizer Walker (Cornell University), and Jane Bunker (Cornell University Press).

SUSHI-fied: The Venture and Value of Implementing SUSHI Protocol — Presented by Katherine Heilman (UNC Greensboro).

Video recording available at

Ideas abounded. Del Biondo spotlighted her group’s faculty survey of perceptions and the use of results to re-design library services and fill gaps (“growth areas”), and invited attendees to view and re-use the JISC OA Pathfinder toolkit and the survey. Beaubien and St-Pierre spotlighted their library’s Open Investment Strategy Committee (OISC), the regularly updated open investment list, and evaluation criteria. Mays introduced AI and shared an AI use case for web-based new book title list generation. (Participants were invited to answer an online poll, with results to be incorporated in the conference proceedings.) Walker and Bunker described three publication tracks (“formats”), including templative designs, and were open about the fact that the partnered Signale Program model, though generating continued interest in the field, is specialized (Germanic Studies) and perhaps not scalable as it stands. Heilman shared her passion for process improvement, describing usage data consolidation (SUSHI in LibInsights) and admitting it required high initial effort.

Innovation Lightning Round 2 — Sandy Avila, Digital Library Sales Manager, SPIE, Moderator

Reported by Shannon Tennant (Coordinator of Library Collections, Elon University) <>

Presentations Included the following:

From ChatGPT to CatGPT: The Implications of Artificial Intelligence on Library Cataloging — Presented by Richard Brzustowicz (Instruction and Outreach Librarian, Carlow University)

Investigating Librarian Perspectives on Generative AI: Preliminary Findings From a R&D project — Presented by Jesper Solheim Johansen (Head of User Research, Keenious), and Lars Figenschou (Senior Academic Librarian, University of Tromsø (UiT) – the Arctic University of Norway).

Are we virtually there yet? VR Trends in Higher ed and What it Means for the Librarian — Presented by Joy DuBose (Associate Professor, Extended Reality and Gaming Librarian, Mississippi State University), Michael Carmichael (Senior Director, Editorial, Sage Publishing), and Ben Naughton-Rumbo (Commercial Director, Bodyswaps).

Increasing Institutional Impact Intelligence with PID Graph Enabled Repositories — Presented by Gretchen Gueguen (Product Owner, Open Science Framework (OSF)).

Video recording available at

Four presentations discussed how new technologies impact libraries. Richard Brzustowicz from Carlow University used ChatGPT to create a MARC record. When compared to the OCLC record, the AI record was not as good, but ChatGPT will likely improve over time. Brzustowicz argued that AI can increase efficiency and supplement rather than supplant human cataloging, and he concluded that we must address ethical issues and bias in the data used to train these tools. In the second talk, presenters surveyed librarians from nine countries about their knowledge of and concerns about AI. Participants also interacted with an AI prototype. Respondents acknowledged the opportunities AI presents but have concerns about the ethics of the new technology. The third presentation discussed virtual reality (VR). The first presenter, from Mississippi State University, said that many faculty are excited about VR products and urged librarians to learn about the technology. The other two presenters, representing Sage and Bodyswaps, discussed their product, which allows users to practice management, public speaking, and communication skills. The final presenter explained how persistent identifiers (PIDs) can be assigned to documents, researchers, and institutions. Once coded with PIDs, the entities that produce and consume research can be linked together with that research and become more discoverable.

Innovation Lightning Round 4 — Ramune K. Kubilius, Moderator

Reported by Ramune K. Kubilius (Northwestern University, Galter Health Sciences Library & Learning Center) <>

Presentations Included the following:

How transformative is it? Tools to Assess the Performance of ‘Read and Publish’ Type Agreements — Presented by Adam Der (Max-Planck Digital Library), and Rita Pinhasi (Vienna University Library).

A Unique Approach to Funding Open Access Ebooks — Presented by David Parker (Lived Places Publishing), and W. G. Maltarich (New York University Libraries).

Open Access Books Collection as a Swiss Army Knife – Let’s Find New Ways to Serve Our Patrons — Presented by Ronald Snijder (OAPEN Foundation).

A Case Study on Open Access Management Challenges and Solutions in Finland — Presented by Vivi Billeso (ChronosHub), and Martin Jagerhorn (ChronosHub). Note: Only Martin Jagerhorn presented.

Video recording available at

Kubilius had the easy task of moderating, while conference program planners should get credit for pulling together a session that provided glimpses into OA models and tools and into international practices. Der and Pinhasi shared information on two open tools created by the OA2020 Working Group on Transformative Agreement Assessment. Jagerhorn spotlighted one country’s approaches to OA, incorporated in its E-Lib and the Finnish Publication Forum, and focused on work at Tampere, Finland’s second-largest university (where an OA support form still exists), and its use of ChronosHub. Parker described a publishing initiative conceived by a psychology faculty member: growing niche eBook collections (including international editors and authors) on a platform designed for course reading, with a model of author opt-in (and no-fee) OA. Maltarich opined that the model meets requirements important to libraries but leaves some concerns still to be tested with this new (since 2020) initiative. Snijder described OAPEN’s work on a (soon to be made public) library dashboard, a recommender service, and Memo, a metadata export module for OA books. During on-site discussion, some possible future collaborations were identified; during online-week discussion, presenters shared “next step” projects.

This concludes the Neapolitan, Stopwatch, and Innovation Session Reports we received from the 2023 Charleston Conference. Watch for the individual Session Reports from the 2023 Charleston Conference, which will appear in the September issue of Against the Grain. In the meantime, you can visit the Charleston Conference YouTube site for selected interviews and videos, and the Charleston Conference site for links to conference information and blog reports written by Charleston Conference blogger Donald Hawkins.


Wandering the Web — Online Muses, Part I: Reference Resources for Art, Music and Other Creative Endeavors

Introduction to “Online Muses,” a Two-part Series

The Muses of ancient Greek mythology and religion were said to be goddesses of science, literature and the arts, including the writing of lyric songs, who helped people to sing, dance, write poetry and create works of art. Creativity nowadays might not come as easily unless you have found a way to tap into the powers of these nine inspirational deities. However, your creativity can flourish if you have access to well-curated and attractive websites that assist in artistic activities, specifically visual art and musical composition. In a two-part Wandering the Web column series, I will first describe and review online resources for enhancing your imagination and expressing your vision, whether you enjoy writing songs or creating art in various media. In Part II, I will present some amazing online museums that you can visit virtually for free.

Websites That Assist Artistic Activities

TinEye is a very powerful search and recognition engine that creates a unique digital signature for each image, which it then compares to other indexed images with lightning speed. TinEye Reverse Image Search is free for the general public’s non-commercial use, and professional artists can subscribe to the TinEye API. The commercial service enables you to buy bundles of searches, which you can use via your browser or by integrating TinEye into personal apps. Essentially, you search by uploading an image; you can also drag and drop a picture or search by URL. TinEye is always crawling the web, increasing the number of images available in its vast search engine. As of May 14, 2024, there were over 67.3 billion indexed images! Just for fun, I uploaded my face photo from work onto TinEye and within 1.9 seconds the site located the photo, listed the file name, and provided the digital file’s dimensions in horizontal and vertical measurements and the file size in kilobytes. (See the site’s help pages for more instructions on how to use TinEye and for information about the site.) Please note that most pictures retrieved by the TinEye platform are copyright protected, and you will need to contact an image owner directly if you want to use their image in your artwork. Your own personal artwork or uploaded images are never saved on TinEye.

The same principles regarding copyright of indexed images apply to the sites in 10 BEST TinEye Alternatives & Similar Sites (2024). Ms. Walker has done a great job of compiling and reviewing websites that are similar to TinEye, including the pros and cons of each site. Another excellent review of TinEye alternatives can be found at AlternativeTo. (Please see the full citations for all reviewed websites in the References.) An example of a TinEye alternative, SauceNAO, is an easy-to-use image source locator whereby you can utilize an array of options to customize your image search. Like TinEye, you can search photos from the billions of indexed images in its enormous search database, which includes Shutterstock, Anime, movies, shows, and more. According to AlternativeTo, the site’s name SauceNAO is taken from the slang phrase “Need to know the source of this Now!”

There are online resources that promote a variety of artistic pursuits. The multinational technology company Yandex LLC provides products and services online, and its Internet search engine is very well-suited to enhancing your creative activities. Yandex Search allows you to search by text, voice, or image to find images, as well as music and other webpages.

Artists can always use a color palette generator, and WIXBlog has published a list of the 8 best free color palette generators online. WIXBlog helps you “harness the power of the Internet by finding the best color combos for your designs,” whether you are creating a logo for a client or working as a fashion or website designer. The blog’s parent site also provides a free website builder so you can try out various color combinations. The best free color palette generator tools include the well-known Adobe Color, as well as sites like Khroma, which interestingly pairs color combination algorithms based on color psychology with five different display backgrounds and web design tools.

The excellent blog post by Crave Painting entitled The top 10 most useful Websites for Artists provides an authoritative and fun selection of art-related websites. The sites include an electronic publishing platform; a visual database with a large collection of pictures you can reference for drawing practice; a color palette generator by Canva; a drag-and-drop web builder; details about an artists’ collective; and useful information about how to effectively use YouTube for “artsy things,” as Crave Painting’s blogger put it. The blog is anonymous, and readers’ personal data are kept strictly confidential. I would highly recommend browsing not just The top 10 most useful Websites for Artists, but also the other postings on this “… art blog about all things drawing and painting: materials, techniques, artists, and of course lots of exercises.” Drawing exercises, for example, are aimed at all levels and include a 15-minutes-a-day drawing routine, five easy exercises for beginners and pros, and five exercises to painlessly learn perspective drawing. I won’t list the many internal links on the site, other than to encourage you to start your artistic journey at the Crave Painting home page.


Musical Websites of Note

Many people are aware of Google’s song retrieval, where you tap the microphone icon and ask, “What’s this song?” There are other websites that allow you to conduct large-scale and detailed song searches from the immeasurable playlist that resides on the Internet. Every Noise at Once helps you find an artist in its simple search box, or you can scan its unique and extensive word cloud and go down the rabbit hole of a playlist that offers literally thousands of granularly described song styles — 6,291 individual genres, to be exact — which are then subdivided by popularity, emergence, modernity, youth, femininity, engagement, background, tempo, duration, color, and name! Mr. McDonald, the intrepid creator of this unbelievably thorough music identification site, also allows you to click each genre to sort by similarity and compare related music styles. I’m not sure about the legality of the site, insofar as you can play entire song previews, but it appears at a glance that you would in fact need to purchase each complete selection on Spotify. (Having a husband who is a prolific songwriter and singer, I am particularly sensitive to the concept of a musician’s creations being available on the Internet for free. Access to original, copyrighted music does not fall into the same category as open access journal articles, which is also contentious and is outside the purview of this column!)

Songcraft is a creative platform that helps musicians become master writers by enabling you to write lyrics, chord progressions, and guitar tabs in Songcraft’s songwriter’s pad. This collaborative site is free, you sign up through Google or Facebook, and it is used by over 80,000 songwriters, according to the site. is a community of 24-hour spaces extended around the globe, with over 700 studios used by producers, vocalists, DJs, band members, podcasters, and even dancers. You can become a partner, film YouTube or student movies at Pirate, or, if you generally need studio space, contact directly. The Pirate Blog provides news, interviews, and artist opportunities for musicians, including drummers, podcasters, and dancers. Songwriters might be interested in reading the article How to Write a Song – Tips and Free Template for Songwriters. Helpful instructions include seven steps to writing a song, elements of a song, song structure, songwriting books and resources, and much more. (See the full citation in the References at the end of this column.)

One final site worth looking at is SongbookPro, a digital songbook app for laptops and tablets at This site lends flexibility to all your songwriting activities. You can easily type or draw your notes freehand using a stylus or your finger; fonts are available in 20 colors along with a highlighting mode; and the chorus sections and chords are fully customizable. Additionally, you can import songs from websites supported by the app or import PDF files. SongbookPro also includes a fully integrated metronome and a key transposition feature. The app provides clear and well-organized instructions, a user’s manual with licensing information and FAQs, video tutorials, and a downloadable manual with all documentation if you want to read a printed copy.

More Inspiration: Online Museum Sites

If you find yourself needing a bit more inspiration, “Online Muses, Part II” will present and review some amazing international museums that you can visit online for free. See you next issue!


Acan, L. (2022, June 6). How to Write a Song – Tips and Free Template for Songwriters [Blog]. en/blog/how-to-write-a-song/#songwriting-books

AlternativeTo. (2024). TinEye Alternatives: Top Image Search Engines and other similar apps like TinEye.

Crave Painting. (2024). Home [Blog].

Crave Painting. (2024). The top 10 most useful Websites for Artists.

McDonald, G. (2024). Every Noise at Once. (2024). About us. company/about-us/ (2024). Pirate Blog.

SauceNAO. (2024). Image source locator.

SongbookPro. (2023). The digital songbook app for tablet and laptop.

Songcraft LLC. (2024). Write better songs. Together.

Spivak, E. (2024, May 5). 8 best free color palette generators online. WIXBlog [Blog].

TinEye. (2024). Reverse image search.

Walker, A. (2024, May 11). 10 BEST TinEye Alternatives & Similar Sites (2024). GURU99.

Yandex. (2024). A fast Internet search engine.


Libraries, Leadership, and Synergies — Workflow Considerations for the Evolving Collections Landscape

Column Editor: Antje Mays (Collection Analysis Librarian, University of Kentucky Libraries) <>

Column Editor’s Note: Building on the April issue’s article on the broader shifts and trends in the collections landscape, this article examines fifteen recent job postings across the collections space, analyzing the advertised positions’ areas of responsibility, required minimum qualifications, preferred qualifications, and workflow insights gleaned from the job descriptions. — AM


As described in the previous article in this series, collections work has expanded from its earlier focus on subject-based selection, purchase, and collection management. Advances in technology and growing complexities in publishing and knowledge dissemination have heightened collection work’s need for analysis, copyright and intellectual property, license evaluation and negotiation, systems interoperability for access and authentication, Open Access publishing and infrastructures. Concurrently, intellectual depth, broad knowledge base, and wide range of content knowledge endure as core components of collections work toward meaningful content selection and collection evaluation in support of instruction and research (Hirsh, 2022). 1 Building on the last article’s overview of collections changes, this article analyzes current job advertisements for prevailing collections functions and expressed skill needs. The workflow observations gleaned from this analysis form the basis of a future in-depth job-design study.

The Interplay Between Library Trends, Task Realms, Skill Priorities, and Workflows

Collection strategy, development, and management are steeped in foresight, planning, tracking emerging information formats, and subject-based needs; collection analysis is steeped in data and assessment; electronic resources and licensing work is steeped in formats, licensing, copyright and intellectual property, authentication, and system interoperability; and acquisitions is steeped in procurement, vendor negotiations and consortial collaborations, financial management, and physical materials in-processing. These realms are inherently entwined. Ascertaining the stability of digital content for long-term preservation informs technology and platform exploration, contract negotiation, evolving purchasing frameworks, and rethinking the balance of physical and digital content.

Skills across the library collections realm include vendor relations, negotiation, working with consortia for collaborative collection development and cooperative purchasing, finance, quantitative and qualitative analysis, inventory management, physical processing, ensuring discoverability, electronic resources management, physical and digital preservation, supporting internal and external constituencies and partners, and workflow management and supervision. The increasing availability and range of digital content requires increasing analysis to check for duplication with existing library resources.

The interrelated nature of these task realms invites areas of overlap, blurred lines of demarcation, position scope creep, and expanding portfolios with competing priorities and growing workloads.

Job Analysis

Method: The analysis was based on 15 recently advertised positions in U.S. academic libraries of various sizes, spanning acquisitions, collections, collection analysis and strategy, and electronic resources and licensing positions with broader collections tasks. Thirteen were gleaned from the Chronicle of Higher Education Jobs2 and two from Association of Research Libraries Jobs.3 Two positions head Acquisitions, two oversee both acquisitions and collection development, three are associate library dean or equivalent roles whose portfolios include Collections, one is a broad Collections position, one covers strategy and analysis, two are collection analyst postings, two cover collection strategy, and two are electronic resources and licensing positions with broader collections duties.4 Text analysis of these position descriptions was performed using Voyant Tools5 for frequency of terminology to identify themes across task families and skill families.6 Text data visualization of the relative predominance of key themes was performed using Wordclouds.com7 for word cloud creation.8
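The term-frequency analysis that Voyant Tools performs on the job-ad corpus can be illustrated with a minimal sketch. This is not the author’s actual method or data; the excerpted duty statements and stopword list below are invented for illustration:

```python
from collections import Counter
import re

# Hypothetical excerpts from job-ad duty statements (not the actual corpus).
ads = [
    "Leads collection development and collection assessment using usage data.",
    "Manages electronic resources licensing and vendor negotiation.",
    "Performs data analysis and assessment to support collection strategy.",
]

# Common function words to exclude, as word-cloud tools typically do.
stopwords = {"and", "the", "to", "of", "using", "not"}

tokens = []
for ad in ads:
    tokens += [w for w in re.findall(r"[a-z]+", ad.lower()) if w not in stopwords]

# The most frequent terms indicate the dominant themes across the ads;
# a word cloud renders the same counts as relative font sizes.
for term, count in Counter(tokens).most_common(5):
    print(term, count)
```

On this toy corpus, “collection” dominates, mirroring how the article’s word clouds surface prevailing themes across task and skill families.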

Component Analysis from Job Descriptions: Duties and Qualifications

All position types: Text analysis and word cloud visualizations of job duties across all position types reveal a predominance of collection development, resources, and connections with library functions across instruction, research, user support, data, and management, as well as working with faculty and learners.

Figure 1: Duties across all job types: word cloud visualization


Skill priorities: Not all of the 15 job ads identified for this article distinguished between required minimum qualifications and preferred qualifications beyond the basics; where only one category of requirements was present, these were folded into the required qualifications. Experience and demonstrated skill permeate the qualification requirements. All jobs’ minimum / required qualifications include the ALA-accredited master’s degree in library/information science or equivalent, demonstrated experience in the functional areas, understanding of adjacent library services, broader knowledge across the spectrum of library services, and specific skills including written communication, management, and vision around diversity and inclusion. Across all position types, preferred qualifications include understanding collections and content in the context of supporting the parent universities’ teaching, learning, and research. Preferred qualifications also include experience in the positions’ core functions, knowledge of trends in collections and scholarly communication, as well as skills in data analysis and visualization, systems and analytical tools, resource discoverability, management, and licensing.

For the granular position analyses within job types, additional word clouds were created to visualize priorities within each job type’s duties, required qualifications, and preferred qualifications. The text columns’ space limitations prevent accommodating these additional word-cloud visualization images. A PDF file containing all word-cloud visualizations with annotations is available at antjemays/52/

Two of the above-referenced job ads are specifically for acquisitions librarians. Predominant duties emphasize resource procurement, receiving and processing, budgeting, expenditures, fund management and analysis, reporting, coordination and collaboration across library and university departments, management, staff oversight and coaching, and strategies for acquiring needed resources. Skill priorities: One of the acquisitions postings made no distinction between required and preferred qualifications and emphasized skills including knowledge of the parent university’s procurement and financial rules, library policies and procedures, knowledge of trends in librarianship, and instructional experience. For purposes of text analysis and word-cloud visualization, these were grouped with the other acquisitions ads’ required qualifications. For the acquisitions roles with separate listings of required and preferred qualifications, required qualifications include familiarity with library service platforms’ acquisitions modules, vendor relations, budget management, project management, strategy, cross-functional collaboration, and the ability to succeed in a tenure-track faculty role. Preferred qualifications include systems-specific experience, supervisory experience, prior experience in collections-related library functions, knowledge of the ACRL Framework for Information Literacy, experience in instruction, research support, and reference, and experience with research guides and reference chat systems.

Two of the jobs combine Acquisitions and Collections; their duties emphasize collections and acquisitions and cover analysis, statistics, reporting, licensing, resources and access, staff management, and reference and instruction. Skill priorities: Required and preferred qualifications alike emphasize experience and demonstrated skills. Required qualifications include experience with acquisitions, licenses, electronic resources, management, assessment, standards and analytical tools, communication skills, and collaboration. Preferred qualifications include knowledge of and experience with authentication technologies, specific systems and proxy server tools, reference and instruction, temperament traits such as leadership and comfort with ambiguity, and active professional service at the national level.

Three of the positions are Associate Dean or Associate University / College Librarian roles that include Collections in their portfolios. Job duties encompass leadership, negotiation, workflow strategies, data analysis, assessment, reporting statistics to internal and external constituencies, collections strategy, collections budgeting, resource sharing, resource discoverability, and managing the institutional repository. Skill priorities center on mature leadership and insight, interpersonal skills, strategy, and advanced knowledge of collections ecosystems. Required qualifications include experience in collections, strategic planning, building partnerships, budgeting, administration, supervision and skill development, educational technologies, and inclusive practices. Preferred qualifications include experience with leading diversity, equity, and inclusion efforts, strategic partnerships, interpersonal skills conducive to positive work environments, coaching and mentoring, copyright leadership, consortial collaboration, subject liaison experience in a quantitative discipline, data analysis, specific systems and analytics tools, and a sustained record of professional service and scholarship.

One generalized Collections position in a community college library encompasses job duties around collection assessment, management, and weeding, library system analytics reports for data-informed decision-making, embedding learning resources in instructional support platforms, oversight of technical services functions, and public-facing collection exhibits. Skill priorities are largely outlined in the preferred qualifications: The only stated required qualification is the master’s degree in library and information science. Preferred qualifications include experience in collection development and strategic evaluation, collection policy creation, in-person and hybrid instruction, educational technologies, student support, and knowledge of collection trends and affordable course content.

One position combines collection analysis and strategy, but the advertised job duties are strategic in nature, with analysis playing an implied supporting role in the background. Duties emphasize strategic direction-setting across all disciplines and formats, vendor relations, price and license negotiation, hosting vendor visits and informational sessions, reviewing campus needs, and staying informed on broader trends across the collections and scholarly communication ecosystem.

Figure 2: Required qualifications - all job types: word cloud visualization
Figure 3: Preferred qualifications - all job types: word cloud visualization

Skill priorities are fully described under the required qualifications umbrella, with no section for preferred qualifications. Required qualifications emphasize experience and demonstrated skills in vendor negotiations, budgeting, data analysis for decision support, and strategic thinking, as well as organizational and interpersonal skills.

Collection analyst: Fewer positions focused on analysis were found than positions focused on collection strategy; only one position focused on collection analysis was found in the job search for this article, a position at an R1 university that was advertised twice in a five-month window. Job duties include coordination of data collection, analysis, and assessment for decision support across the collections realm, automation-assisted workflow optimization, and collaboration with consortium peers. Skill priorities span technical, analytical, and organizational skills: Required qualifications include demonstrated data proficiency, relational database design and programming, and SQL, as well as organizational and interpersonal skills. Preferred qualifications list the skills and knowledge of tools that would ready a candidate for quantitative analysis and assessment design conducive to data-informed strategies. Desired proficiencies include coursework in quantitative analysis, visualization, applied statistics, and analytics, specifically named software (R, SPSS, SAS, Tableau, Microsoft Power BI), and knowledge of collection management standards and library systems’ reporting tools.
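The SQL and relational-database skills this posting requires serve decision-support reporting of the kind described above. As a rough, hypothetical sketch (the table layout, column names, and figures are invented, not drawn from any real library system):

```python
import sqlite3

# Hypothetical e-resource usage table; schema and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eresource_usage "
    "(title TEXT, year INTEGER, uses INTEGER, turnaways INTEGER)"
)
conn.executemany(
    "INSERT INTO eresource_usage VALUES (?, ?, ?, ?)",
    [("Database A", 2023, 1200, 15),
     ("Database B", 2023, 80, 0),
     ("Database A", 2024, 1350, 40)],
)

# Aggregate uses and turnaways per title: the kind of decision-support
# report a collection analyst might build with SQL.
rows = conn.execute(
    """SELECT title, SUM(uses) AS total_uses, SUM(turnaways) AS total_turnaways
       FROM eresource_usage
       GROUP BY title
       ORDER BY total_uses DESC"""
).fetchall()

for title, uses, turnaways in rows:
    print(title, uses, turnaways)
```

High turnaway counts alongside high usage (as with the hypothetical “Database A”) would flag a resource whose license caps may need renegotiation.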

Two standalone collection strategist jobs were identified for this project, both at R1 institutions. One position is strategically situated to influence library-wide collection directions, while the other position additionally serves as subject bibliographer and academic liaison. Job duties include collection strategies, allocations, report design, data analysis and visualization for decision support and reporting to library constituencies, collection assessment for accreditations and program reviews, licensing, vendor relations, and subject-based instruction and research support. Skill priorities emphasize experience across acquisitions, collections, and technology. Required qualifications include experience with collections in all formats, acquisitions, electronic resources, library services platforms, subject and language background, and communication skills. Preferred qualifications include experience and skill in collection development and management, collection assessment, data analysis and visualization, data harvesting for library collections, licensing agreements, scholarly communication trends, reference and instruction, technologies in the disciplines served, interpersonal and organizational skills, and active professional service and scholarship.

Two electronic resources and licensing positions were included in this project for their wide sweep across collection development and analysis duties. Although one is advertised by a private liberal arts college and the other by a land-grant R1 institution, the two positions share a broad range of duties across the collections space. Job duties reach beyond electronic resource lifecycles, discoverability, access, and licensing to include broader collections and acquisitions duties around vendor relations, negotiation, resource evaluation, assessment, and data analysis. Skill priorities emphasize demonstrated experience. Required qualifications include experience in data tracking and analysis; interpersonal, collaborative, organizational, communication, and problem-solving skills; and commitment to diversity. Preferred qualifications include problem-solving, experience in electronic resources and collections, working with faculty, analytical skills, management and supervision, data analysis with tools including Excel and Tableau, experience with library systems analytics and configurations, knowledge base metadata analysis, and vendor negotiation.

Task Realms, Workflows, Overlap, Scope Creep

Observations on workflows and scope: Components from across the broad range of collections work were found in all 15 positions examined for this article, despite the positions’ considerably different levels within their respective libraries’ hierarchies. The positions at the associate dean level include collection strategy and analysis, vendor negotiation, licenses, constituency relations and outreach, management, as well as training and development. Yet many of these components also appear in the other collections and acquisitions positions situated at more operational levels. Several of the positions also include liaison duties; one included donor relations.

Collections-related task realms: Management includes workflow oversight, workforce supervision and development, faculty mentoring, and strategic oversight of collections directions and processes. Strategy encompasses big-picture overview, future orientation, forward-facing direction-setting, and planning how goals will be achieved; context determines the shape and meaning of strategy in a given position. Examples of strategy include overall direction for collection activities, vendor negotiations, and consortial collaborations. Analysis implies gleaning insights through quantitative data analysis and interpretation of qualitative factors. Proliferating data and measurements enable libraries to infuse data-informed practices into collection assessment, organizational communication, and strategy formulation, and continuously improving data-collection capabilities and analytical tools have increased the categories and amounts of data at librarians’ fingertips. In academic and research institutions, collecting decisions are informed by quantitative data (enrollment, research metrics, e-usage and hardcopy circulation, and shelf-occupancy projections that inform digital-replacement, weeding, and retention decisions) and by qualitative information on the institution’s intellectual directions in the form of evolving research priorities and academic programs; subject-knowledge-pertinent assessment of collection strengths and gaps sits at the intersection of quantitative and qualitative analysis. E-resource analysis includes cost-per-use, usage, turnaway statistics, and content coverage data. Collection management, as referenced in several of this project’s job ads, includes weeding and shifting of physical collections, acquiring new pieces where unique content germane to specific disciplines exists only in physical form, and determining potential digital replacement for physical content.
Acquisitions tasks include allocations, fund management, procurement and payment processes, vendor relations and negotiation, and processing incoming physical materials. Licensing duties include license evaluation and negotiation. E-resources tasks include authentication, access management, system interoperability, and analyses as described above.
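Cost-per-use, one of the e-resource metrics named above, is simply annual cost divided by recorded uses. A minimal sketch, with figures invented for illustration (not drawn from any library’s actual data):

```python
# Hypothetical annual subscription costs and COUNTER-style use counts.
resources = {
    "Journal Package X": {"cost": 25000.00, "uses": 12500},
    "Database Y": {"cost": 8000.00, "uses": 400},
}

# Cost-per-use = annual cost / recorded uses; lower is better value.
for name, r in resources.items():
    cpu = r["cost"] / r["uses"]
    print(f"{name}: ${cpu:.2f} per use")
```

In this toy example the journal package works out to $2.00 per use while the database costs $20.00 per use, the kind of contrast that feeds cancellation and renewal decisions.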

Outside the collections realm: This project’s job ads also described duties not directly tied to collections, including donor relations, liaison to campus Information Technology to ensure system interoperability, reference and research consultations with rotating and ad-hoc scheduling, and academic liaison roles encompassing instruction, research support, and focused collection development. Several, but not all, position descriptions also included the open-ended category of “other duties as assigned.”


Not explicitly mentioned in the job ads: None of the job descriptions include strategic scanning of the evolving Open Access landscape or evaluating OA frameworks for economic sustainability, despite the growing importance of broadly determining the place of OA content and infrastructures across the library collections landscape. Additionally, neither physical nor digital preservation functions were found in the stated job duties.

Defining strategy tasks: While most collections-related tasks are clearly defined, strategy has yet to reach a universal definition in the context of library collections work: while the concept of strategy looks forward and sets direction, duties and priorities for actual collection strategist positions vary widely. Practitioner conversations around strategy positions have revealed immediate catch-up priorities in physical collection maintenance, driven by pressing space concerns or by obsolescence in subject-focused physical collection areas accumulated gradually over long periods of time. These catch-up priorities are a marked departure from understanding strategy as forward-looking overview and direction-setting, a discrepancy which risks confounding and disappointing newly recruited incumbents whom the position description led to expect forward-facing strategy work.9 More broadly, collection strategy is written into several organizational levels in the job ads examined for this article. Clear definitions and delineations of strategy tasks are warranted to avoid competing perspectives, workflow frictions, and gatekeeping across multiple layers of hierarchy.

Workflow complications, overlaps, and recommendations: Across the library collections space, several task realms are inherently intertwined. Due to this interrelated nature, an assortment of functions resides in several job types in the ads examined for this project. For example, vendor relations, negotiations, and fund allocations, as well as license evaluation and negotiation, are found not only in acquisitions and collections positions but also in associate dean and collection strategist roles. Additionally, licensing duties are found in the postings focused on electronic resources and licensing librarian roles. Collections analysis and strategy are deeply intertwined, as thoughtful decisions are informed by analysis of targeted collection and content areas and by sweeping assessment of broader trends. Analysis therefore appears in many of the broader library collections positions and collection strategist roles, while strategy also appears in the two collection analyst postings. Consortial collaboration and negotiating consortial group purchases are tasks found at varying levels of the organization, pointing to the importance of clearly distributed library-consortium interface roles. Spreading task realms across too many functional areas and organizational levels spawns overlap, conflicting perceptions of responsibility, duplication and dilution of effort across multiple roles, mutual impediment, and working at cross-purposes, rather than fostering the intellectual cross-germination that widens perspectives and strengthens solutions-finding.

Scope Creep and Future Research

Several of this project’s job postings include full-service academic liaison roles, reference service, and “other duties as assigned.” One included liaising with campus Information Technology; another position included donor relations. The rising volume and range of library users’ needs, expanding intellectual scope, and growing complexity amid continual technological growth and diversification across instructional and productivity realms have led to a corresponding rise in the depth, breadth, range, and volume of tasks across patron-facing liaison realms and collection-focused management and analytical task areas.10, 11, 12 As task volume and range grow in each domain, incumbents face enlarging portfolios and competing priorities between ineffectively delineated task realms, often in a context of declining resources and staffing.13, 14, 15 Future research will study workflow solutions and skill strategies for evolving work across library collections.


1. Sandra Hirsh. 2022. Information Services Today: An Introduction. Third edition. Lanham, Maryland: Rowman & Littlefield Publishers.

2. Chronicle of Higher Education Jobs. https://jobs.

3. ARL Jobs.

4. Anonymized 15 job ads used for text analysis and visualizations (xlsx). Compiled from Chronicle and ARL Jobs by author.

5. Voyant Tools.

6. Voyant corpus analysis of 15 recent job ads (pdf), produced by author.

7. Word clouds:

8. Word cloud visualization of 15 recent job ads’ duties, required qualifications, and preferred qualifications (pdf), produced by author. antjemays/52/

9. Confidential conversations with professional colleagues in several research-intensive large university libraries between 2023 and 2024. The names of professionals and institutions are withheld for privacy.

10. Jennifer Nardine. 2019. “The State of Academic Liaison Librarian Burnout in ARL Libraries in the United States.” College & Research Libraries 80 (4): 508–24. doi:10.5860/crl.80.4.508.

11. Duane E. Wilson, Kendall Campbell, and Isabella Beals. 2024. “Subject Librarian Definition and Duties: Connecting the Library and the University.” Journal of Academic Librarianship 50 (3). doi:10.1016/j.acalib.2024.102867.

12. Annette Day and John Novak. 2019. “The Subject Specialist Is Dead. Long Live the Subject Specialist!” Collection Management 44 (2–4): 117–30. doi:10.1080/01462679.2019.1573708.

13. Laura Banfield and Jo-Anne Petropoulos. 2017. “Re-Visioning a Library Liaison Program in Light of External Forces and Internal Pressures.” Journal of Library Administration 57 (8): 827–45. doi:10.1080/01930826.2017.1367250.

14. Jonathan Bull and Alison Downey. 2024. “Not a Special Project Anymore: Creating a Culture of Sustainable Deselection and Gifts-In-Kind with Limited Staffing.” Technical Services Quarterly 41 (1): 64–81. doi:10.1080/07317131.2023.2300512.

15. Karen Jensen. 2017. “No More Liaisons: Collection Management Strategies in Hard Times.” Collection Management 42 (1): 3–14. doi:10.1080/01462679.2016.1263812.


Optimizing Library Services — Financial, Technical, and Legal Challenges with Introducing AI-enhanced Functionality and Content into Library Collections

Column Editors’ Note: As technological advances continue to evolve, publishers must adapt to these changes to better support the libraries they serve. IGI Global fully understands its responsibility to enhance its practices to grow with the increased impact of Artificial Intelligence within the academic industry. In order to better align with the modern impact of AI integration into library spaces, IGI Global is dedicated to ensuring that each publication includes a plethora of rich metadata as well as detailed descriptions that effectively define the various points of impact within our researchers’ work. By supporting the various library platforms with in-depth information, our content effectively aligns with the growth in AI functionality, thus increasing the search success and accessibility to the various topics and subjects within our portfolio.

As Artificial Intelligence continues to be at the forefront of conversation and development within the academic community, IGI Global has made this topic a focal point of our offerings. As such, IGI Global has compiled more than 500 titles of cutting-edge AI research within the areas of business, medicine, education, computer science, and engineering to create our Artificial Intelligence e-Book Collection. With IGI Global’s quick and comprehensive publishing process, this collection provides professionals with the latest trends and developments within this ever-evolving technology. To ensure that your library’s holdings contain the latest in AI research, including titles such as Emerging Advancements in AI and Big Data Technologies in Business and Society (979-8-3693-0683-3), AI in Language Teaching, Learning, and Assessment (979-8-3693-0872-1), and AI Approaches to Literacy in Higher Education (979-8-3693-1054-0), visit https://www.igi-global.com/e-resources/topic-e-collection/artificial-intelligence/229 for more information on IGI Global’s Artificial Intelligence e-Book Collection. — WH


I just returned from the NISO+ Conference, which is consistently an excellent venue for networking, collaboration, and staying informed about the latest developments, technologies, and implementations that impact the scholarly information ecosystem. Not surprisingly, there were numerous engaging and informative sessions on artificial intelligence (we perhaps humorously, but inaccurately, labeled the program “All AI All the Time”) that typically pertained to ethical and workforce considerations in addition to more applied information about the various AI platforms, protocols, and tools currently in production in academic libraries.

Although I was very interested in the former, from the standpoint of practical implications and potential applications for Coastal Carolina University (CCU) Libraries, I was particularly interested in the latter. As I’ve been pondering various AI tools and implementations, numerous questions have arisen that will have both immediate and long-running implications for libraries, particularly within the context of library collections budgets.

Tools and Resources

Academic libraries investing in a variety of technologies and tools beyond information resources to enhance services and support their user communities is not a particularly new trend. However, the AI tools recently in production or being beta tested to enhance these resources are further blurring the line between traditional library information resources and tools.

CCU University Libraries has been trialing, beta testing, or subscribing to three such online resources that have integrated AI functionality into their platforms. The first vendor recently rolled out a product described as “a generative AI product to help researchers and research institutions get fast and accurate summaries and research insights that support collaboration and societal impact.” This AI-enhanced functionality is essentially housed in their database but is being marketed and sold at an upcharge.

The library is also testing a second AI research tool (in beta) that adds AI functionality to a content page to generate a summary, show related topics and recommended content, and let users ask context-based questions and receive answers about the content at hand. Even though this is a nonprofit service, I anticipate a surcharge once this functionality is integrated into the existing product.

Lastly, the library recently subscribed to a search and discovery tool that leverages AI to increase the discoverability and evaluation of research; it launched in 2018 but was very recently acquired by another company. This tool draws its information from databases, journal articles, and metadata in the sciences, and has access to some other publishers’ content. This raises a question: since the library may already subscribe to the information resources from these publishers, are we in essence being charged twice for the publishers’ corpus of information resources, then likely paying a significant upcharge for the functionality the tools use to generate content?

Impact on Collections Budget

This, in turn, raises an even larger question. Based on these three examples, can we assume that AI enhancements are going to be consistently bundled as an added expense to already stretched library budgets? Many libraries already faced with very limited and ever-contracting collections budgets will either have to cut existing resources from a collections budget that has already been cut to the bone in order to afford new AI tools and enhancements to existing resources, or put their students and user communities at a competitive disadvantage by not giving them access to the tools they’ll need for academic success and will likely use throughout their careers. The financial implications of incorporating such advanced technology cannot be overlooked.

This is particularly relevant when we consider that library collections budgets are still contending with the impact that Open Access (OA) funding models have created. While there is an increasing recognition of the importance of OA publishing at CCU and in the academic community writ large, many libraries, either via the increased subscription costs from negotiating and utilizing transformative agreements or from directly underwriting APC charges, have at least partially borne the financial burden of supporting OA publishing at their institution.1

I’ve recently employed several strategies for generating modest savings in our collections budget to support OA publishing, including:

• Conducted an in-depth quantitative and qualitative analysis of every resource subscription across the entirety of our collections, and cancelled almost twenty relatively high-cost, low-use databases that were typically multi-disciplinary in nature with numerous redundant titles

• Cancelled all monograph approval plans and moved to a completely use-based or request-based acquisitions model

• Leveraged consortial deals and economies of scale to grow our ebooks collection from ~300,000 to 1.5 million titles
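The first strategy above hinged on spotting redundant titles across multi-disciplinary databases. A minimal sketch of that kind of overlap check, using invented title lists (not CCU’s actual holdings):

```python
# Hypothetical title lists for two aggregator databases; heavy overlap like
# this is one signal weighed when considering cancellations.
holdings = {
    "Aggregator A": {"Journal of X", "Journal of Y", "Journal of Z"},
    "Aggregator B": {"Journal of Y", "Journal of Z", "Journal of W"},
}

# Set intersection finds titles available in both products.
overlap = holdings["Aggregator A"] & holdings["Aggregator B"]
pct = len(overlap) / len(holdings["Aggregator B"])
print(f"{len(overlap)} shared titles ({pct:.0%} of Aggregator B)")
```

In practice the title lists would come from vendor KBART files or the library services platform, and unique-title counts rather than raw overlap would drive the keep-or-cancel decision.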

However, many libraries are locked into their existing resource contracts and librarians won’t have the flexibility I enjoyed when making these sometimes-difficult decisions. The collections budget has barely stabilized from the significant financial disruption of these various Open Access publishing models, yet now the library is faced with deciphering how best to fund AI tools and AI-infused enhancements and upgrades to existing resource platforms.


Discovery platforms are being influenced by the integration of AI in both functionality and content, which introduces multifaceted legal issues concerning ownership and rights in AI-generated content and potential limits on text and data mining. The legal landscape surrounding AI-enhanced functionality and the content it generates in a library’s discovery platform is still evolving, so it’s unclear where the line is drawn between a publisher’s corpus of scholarly information and the manner in which this content is mined and regenerated by AI in the library’s discovery platform.

Library users have generally moved away from discrete, vendor-specific information search and retrieval tools toward robust, centralized discovery platforms where citations, abstracts, and often the full-text journal content are made available. But is it conceivable that legal issues pertaining to AI-generated content might force users to return to each publisher’s or aggregator’s search interface to search for and retrieve information?2


As libraries continue to subscribe to AI tools and upgrade existing resources with AI-enhanced discoverability and retrieval, how do we demonstrate evidence of impact and return on investment to show that the added cost is justified? Defining clear, measurable objectives and establishing key performance indicators for impact is difficult for several reasons. The lack of standardized benchmarks and industry-wide metrics for assessing AI in libraries makes comparison with peer and peer-aspirant institutions challenging. In addition, outcomes are hard to attribute solely to AI tools: determining the specific impact of artificial intelligence on user engagement, workflow efficiencies, or related cost reductions is complicated by the novelty of AI implementations and the relative subjectivity and qualitative nature of user satisfaction.

Possible Solutions

With all of the promise and transformative potential of integrating AI technologies into library services and resources, what are some possible solutions to the complex financial, technical, and legal landscapes that libraries are facing?

In addressing limited budgets and the upfront costs of implementation, adopting a “build it or buy it” strategy for identifying free and commercial AI platforms and protocols suited to specific library needs, along with a phased implementation approach, can significantly defray the costs of expensive AI technologies. Prioritizing these solutions while collaborating with consortia and regional partners to identify economies of scale can also cut costs, as can using standardized protocols and participating in open standards initiatives for integration and interoperability with existing library systems. Lastly, developing clear policies and ethical guidelines for the use of AI-generated content and AI-enhanced discoverability, with an eye to safeguarding user privacy, can help avoid potential legal complications.


The financial, technical, and legal challenges associated with introducing AI-enhanced functionality and content into library collections are complex and multifaceted. Libraries must carefully consider budgetary allocations, resource management, and long-term sustainability to successfully navigate these challenges. Collaborative approaches and a strategic focus on demonstrating the value and impact of AI technologies can contribute to the effective integration of these resources, ensuring that libraries continue to evolve in the digital age.




The Digital Toolbox — One Size Does Not Fit All: Part 2

How Lending Models Impact User Experience

Column Editor: Steve Rosato (Director of Digital Book Services for Academic Libraries, OverDrive, Cleveland, OH 44125)


Part two of our series examines the complexity of lending models and how each impacts the user experience. Librarians from a variety of institutions discuss the challenges and potential solutions for offering class sets, the importance of robust bibliographic records for discoverability, and the ongoing debate over the prioritization of open access resources. You can read part one of this series, which introduces how librarians manage digital collection development to deliver on access and affordability, here: the-digital-toolbox-one-size-does-not-fit-all-part-1-of-2/

Do different lending models and licenses impact the user experience for students and faculty?

Al Salatka (UTC): Yes, we’d like to be able to offer more “class sets” for classes and for reading groups. That model doesn’t seem to be available for many of the books being used in higher education. Being able to request class-set licenses for books when that option isn’t offered would be very helpful.

Are there challenges in ensuring discoverability and equitable access to content across various models?

Katie Gohn (UTC): All of the OverDrive titles are available through our PrimoVE discovery tool. We also love that we can create marketing campaigns and collections that we can link directly to within social media or blog posts. Discovery is only going to be as good as the data provided in MARC records. More complete bibliographic records that are consistent and better conform to descriptive standards, like RDA, would allow a more seamless process and less staff time spent remediating records so that they appear as similar to other eBook records as possible.

“We will always look to provide access to Open Access materials, but not all Open Access content is worthwhile, and it can overwhelm discovery systems if we aren’t careful.”
— Katie Gohn

Victoria Turner (ECC): Very much so. As stated earlier, there is inequitable access to certain streaming video content providers based on whether the learner is face-to-face or online, and on synchronous streaming options. While one option would certainly be for online students to watch a documentary or film on their own time, it becomes inequitable in that they have to find time outside of the classroom to do so. Additionally, our library is currently a standalone institution, without a discovery layer in its system. We are in the process of migrating to Ex Libris’s Alma and are hopeful that its electronic resource management opportunities will better connect our users to more streaming video content, as well as detail any lease/access options for individual titles within each video record.

Stephanie Kaceli (CU): Not all collections are available in ERM knowledge bases, so for smaller libraries, managing MARC records is not sustainable. The difficulty of downloading some eBooks makes it tough for the less technology-savvy person.

The world of eBooks is so different from that of eJournals, and students, particularly, do not understand why they are so different. Just trying to download an eBook can be headache-inducing. Whenever I need to use Adobe Digital Editions, I think to myself, “no wonder students are so frustrated and prefer the print text over the electronic.” DRM-free is the gold standard for us, but it is not always fiscally sustainable, nor do all publishers offer DRM-free licenses.

How do you collaborate with users to meet their needs? Are acquisitions models considered in this collaboration?

Keani King (UTC): All patrons are able to make requests for materials, and we try to purchase as many requests as we can. These requests come in as direct purchase requests or interlibrary loan requests. If patrons ask for a particular format (print, eBook, audio), we try to meet that need, but sometimes budget restraints do not allow us to do that. We typically look to acquire eBooks with unlimited access and no DRM for use in courses or campus reading groups. We purchase perpetual access through OverDrive, so it is typically a one-user, one-title opportunity. Again, this is the same model we provide for print, so it works for us.

Victoria Turner (ECC): Each semester, I meet face-to-face with a department at department meetings to promote library resources. This interaction provides me with an opportunity to listen to their needs. Nursing and other health science programs look for the currency of video content rather than delivery method; other departments, such as History, prefer longevity of access to a specific documentary. This is why I have worked to increase my video budget over the years, so I can meet various needs through a variety of content providers.

What role do you see open access models playing in content acquisition for your library?

Katie Gohn (UTC): At this point, open access models are primarily impacting our acquisitions as they relate to journal content. We are regularly investigating both the acquisition and creation of open access resources for our faculty and students. Changes in publisher business models are really pushing open access to the forefront; however, it is not always a clear benefit to the library or researcher. More transparency in the business models aligned with open access publishing is needed so that everyone better understands the tradeoffs and benefits when entering into new licensing agreements. We will always look to provide access to Open Access materials, but not all Open Access content is worthwhile, and it can overwhelm discovery systems if we aren’t careful.

What are the potential benefits and drawbacks of prioritizing open access resources, and how do you balance this with the need for diverse and specialized content?

Stephanie Kaceli (CU): Cairn is still waiting to see how this all shakes out, and, since we have limited funds, even more limited staff, and no guaranteed way to forecast the future, we are hesitant but not afraid to invest some money into healthy OA models. This model works if libraries continue to invest in projects, but what happens during budget cuts? Cairn does invest in OA offers that open the back catalog to our community, but we are highly selective, as there are only so many dollars in the budget. For projects in which we are not investing, we are careful about what we add to our discovery layer because of the uncertainty of continued access. Again, the risks are not just budgetary; managing such collections and titles with a small staff is also a risk. For us, it comes down to finding models that seem more sustainable and guarantee access if OA is pulled. We sure could use a palantír!

Katie Gohn (UTC): Prioritizing open access means that more people can access important scholarly information and literary works with fewer barriers. This has the potential to open up information to researchers and learners across the world; however, that isn’t necessarily what is happening at the moment. For libraries, open access content, at times, can come in large packages that are typically not the most current. This isn’t always the case, but we’ve found that huge collections of “out of copyright” titles available through large Open Access services often are not easily usable by our patrons (the interfaces aren’t great) and there can be a lot of duplication.

Another drawback to prioritizing OA content when it comes to journals and scholarly publications is that OA may be more popular in some disciplines than in others, which leaves the OA collection unbalanced. Researchers also have to be aware of the esteem of the journals they publish in. OA titles can sometimes be smaller and more obscure, so researchers are uninterested in publishing in these journals because their impact factors may be low. Because of this, researchers who want to publish in bigger OA journals must pay large APCs to publishers in order for their materials to be available OA. In the end, libraries and institutions are still bearing the burden of funding scholarly publication, even in OA form. Again, more transparency in the industry, as well as more OA access in newer and popular areas, would be highly beneficial. Until this happens, we will continue to prioritize our acquisitions based on our collection development policy, which meets our primary users’ curricular and learning needs. There will be a mix of OA and traditional publishing. We are always looking for diverse and specialized content, as we serve a broad range of users across our university.


This two-part conversation gives voice to how librarians use a variety of access models to execute short- and long-term collection strategies. Many academic library directors who work with OverDrive have developed what is becoming an essential expertise in managing lending models, optimizing how they serve their campus communities. They have learned that additional access options make it easier to meet both faculty and student needs, extending a library’s reach in a way that is cost-effective and sustainable.

The librarians who participated also called out several other factors in the digital sphere that, if managed well, can drive success: the quality of bibliographic records, equitable access to streaming video content, and how Open Access (OA) resources are used. OA continues to play a key, and growing, role in meeting the needs of higher education. In addition to the well-known depth of its catalog, OverDrive is actively adding quality OA titles and suppliers while monitoring feedback. Indeed, digital format options are no longer optional; they are the baseline for how content is accessed and made available.

But it’s not all about format, of course, in the academic environment. The mandate for academic libraries is twofold: first, to offer a variety of format options for efficiency and sustainability, and second, to provide access to the content that students and faculty need. The choice cannot be either access or affordability. Service providers must offer a path that does not come at the expense of one or the other.

Thank you again to the librarians who contributed to this series of email interviews. Their experiences and insights were invaluable and contribute greatly to a deeper understanding of this important and emerging topic:

• Victoria Turner, Technical Services Librarian, Elgin Community College (IL)

• Katie Gohn, Head of Collection Services, University of Tennessee at Chattanooga

• Keani King, Collections Specialist, University of Tennessee at Chattanooga

• Al Salatka, Interim Director of Acquisitions and Content Management, University of Tennessee at Chattanooga

• Stephanie Kaceli, Dean, Educational Resources, Cairn University (PA)

The Digital Toolbox — One Size Does Not Fit All: Part 1 of 2, “How Libraries Use a Variety of Lending Models to Optimize Access and Affordability of Digital Book and Video Collections” appeared in Against the Grain, v.36#2, April 2024, pp. 46-49.


Biz of Digital — Pilcrow: A New Open Peer Review Platform

Kelly Sattler (Project Manager for the Libraries’ Strategic Projects, Michigan State University Libraries, 366 W. Circle Dr. #18, East Lansing, MI 48820; Phone: 517-884-0869)

Column Editor: Michelle Flinchbaugh (Digital Scholarship Services Librarian, Albin O. Kuhn Library & Gallery, University of Maryland Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250; Phone: 410-455-3544)


Approximately four years ago, the Michigan State University (MSU) Libraries formed a collaborative unit, Mesh, with the College of Arts and Letters (CAL) that focused on advancing digital scholarship. Mesh embodies the Libraries’ values of collaboration, stewardship, and expertise. It also supports our strategic directions for being the “center of activity and engagement” and “leading in strategies for open access and scholarly communication.” I joined the unit representing the Libraries as we started with two projects — deploying an institutional repository, MSU Commons, and building a double open peer review application, which we have named Pilcrow.

Dr. Christopher P. Long, Dean of CAL, originally had the idea for Pilcrow many years before and had started the process of developing it to support the Public Philosophy Journal and their collaborative community process. The journal operates with the core values of “Thick Collegiality,” “Ethical Imagination,” “Diversity, Equity, and Inclusion,” and “Trust and Transparency.” Reviewers are encouraged to develop trusting relationships with authors and to imagine how their comments and suggestions can enhance the work and enrich the scholarship. He was able to secure the funding from the Mellon Foundation that enabled us to pursue the work to make Pilcrow a reality. Pilcrow is an effort to put the vision he had outlined in the original published essay of the Public Philosophy Journal into practice.

The motivation was to develop a supportive, values-based review application in service of a collaborative community whose participants work to improve the material in light of the values of the publication. The application was founded on creating supportive experiences for authors, reviewers, and editors, and it provides an environment for transparent, collegial, and collaborative peer review practice.

Development Practices

The Pilcrow development team decided to use the Agile/Scrum methodology for managing the project. The team assumes shared responsibility for the work, which we capture in user stories. These stories are translated into actionable work tickets that are then prioritized in the product backlog. We use the Zenhub plug-in on top of GitHub to track our tickets. Each team member then pulls in the tickets they will work on for the sprint, ideally from the top of the prioritized product backlog. A sprint is a set timeframe during which the team focuses on a smaller set of work to add specific functionality. We started with a typical two-week sprint, but with our team being small and most people having other responsibilities beyond the Pilcrow development team, we expanded it to four weeks to have greater impact.

We hold the typical Scrum meetings: a team retrospective to review what went well and what we might try differently to improve any aspect of the team or its development process; a review meeting with the stakeholders to share what we’ve accomplished over the sprint and to aid in setting direction for the next one; and planning meetings to determine what to work on for either the next sprint or the next release. We set our release cycle to a semester, during which we focus on a particular feature or function to develop in the application. We also hold regular stand-up meetings to discuss progress and help any team member who may be blocked or stuck, and we do daily asynchronous check-ins with the larger Mesh unit using the application Range.

The team has changed over the four years, but typically we have one full-time developer and three part-time developers, one of whom focuses on user experience. I have the role of Scrum Master, the person who coordinates the meetings and cares for the team’s health. We have prioritized accessibility by using axe-core, a free, open-source JavaScript accessibility rules library, for automated testing. Pilcrow’s code is open source and stored on GitHub.

Application Roles

Pilcrow has a hierarchy of roles, which include:

An “Administrator” handles the system/platform tasks. They most likely installed the software and got the instance of the application running. This role isn’t really concerned with the content, even though they have access to it, but with keeping the application running. They send the invitation to join the instance to the Publication Administrator.

A “Publication Administrator” is the person responsible for setting up the Publication layer. They’ll also be the person who invites someone to be an editor of a publication within their instance.

An “Editor” manages the process for a publication, such as a journal. They’ll likely be the one to determine whether a submission is initially accepted or rejected based on the publication’s criteria, and after the review is complete they will likely judge whether to ask for a resubmission or to move forward with including the submission in the publication. They will likely assign the Review Coordinator and possibly reviewers.

A “Review Coordinator” is responsible for managing the review process such as inviting reviewers, enforcing deadlines, answering questions from reviewers and authors, and ending the review cycle.

A “Reviewer” is the person assigned by the review coordinator or editor to engage with submissions under review, provide feedback, and discuss them with the review team. The reviewer is invited to the system and can accept or decline the invitation to participate in a review. They also have the option to leave a holistic response concerning the entire submission and to extract their work from Pilcrow as evidence of their own scholarly service engagement.

A “Submitter” is typically the author, who has submitted a piece of work and participates in the review. The submitter can draft their work in Pilcrow, copy and paste from another application, or upload a variety of document types. Pilcrow supports a wide variety of file types, including .csv, .docbook, .docx, .epub, .html, .latex, .md, .odt, .rtf, and .txt.
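The role hierarchy described above amounts to a simple capability model: each role is concerned with a different layer of the application, from platform upkeep down to individual submissions. The sketch below is illustrative only, not Pilcrow’s actual implementation; the capability names are my own shorthand distilled from the role descriptions.

```python
from enum import Enum, auto

class Role(Enum):
    """Pilcrow-style roles, ordered roughly from platform to content level."""
    ADMINISTRATOR = auto()               # keeps the application instance running
    PUBLICATION_ADMINISTRATOR = auto()   # configures publications, invites editors
    EDITOR = auto()                      # accepts/rejects submissions, assigns coordinators
    REVIEW_COORDINATOR = auto()          # invites reviewers, enforces deadlines
    REVIEWER = auto()                    # comments on submissions under review
    SUBMITTER = auto()                   # authors and submits the work

# Hypothetical capability map; the action names are assumptions, not Pilcrow's API.
CAPABILITIES = {
    Role.ADMINISTRATOR: {"install_instance", "invite_publication_admin"},
    Role.PUBLICATION_ADMINISTRATOR: {"configure_publication", "invite_editor"},
    Role.EDITOR: {"accept_submission", "reject_submission", "assign_coordinator"},
    Role.REVIEW_COORDINATOR: {"invite_reviewer", "end_review_cycle"},
    Role.REVIEWER: {"comment", "leave_overall_response"},
    Role.SUBMITTER: {"create_submission", "reply_to_comments"},
}

def can(role: Role, action: str) -> bool:
    """Return True if the given role has the named capability."""
    return action in CAPABILITIES[role]
```

A real system would likely let higher-level roles inherit or delegate capabilities; the flat map above just makes the division of responsibilities explicit.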

Pilcrow’s Workflow

This is how I would envision people using Pilcrow…

A person or group of people decide to use Pilcrow to perform reviews of text. This could be for a publication, a class, a blog, or another purpose. A person with a bit of technical skill becomes the administrator and, using the code from GitHub, sets up an instance of Pilcrow hosted by the group. The administrator assigns the role of Publication Administrator to the appropriate person, who receives an email directing them to the application. They then create an account or log in with their Google or ORCID account. Each instance may host multiple publications.

The Publication Administrator proceeds to assign one or more editors and to configure the publication within the Pilcrow system. This entails providing a name and description for the publication, indicating whether it is publicly visible or hidden, and whether it is currently accepting submissions (open) or not (closed). They also provide up to six core values that the publication holds, referred to as style criteria; reviewers can associate their comments with these criteria. For the Public Philosophy Journal mentioned earlier, this is where its four values would be listed.
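A publication’s configuration, then, amounts to a name, a description, two flags, and up to six style criteria. A minimal sketch of that configuration in Python follows; the field names are my own shorthand, not Pilcrow’s actual schema.

```python
from dataclasses import dataclass, field

MAX_STYLE_CRITERIA = 6  # a publication may define up to six core values

@dataclass
class Publication:
    name: str
    description: str
    publicly_visible: bool = True        # visible vs. hidden
    accepting_submissions: bool = True   # "open" vs. "closed"
    style_criteria: list = field(default_factory=list)

    def __post_init__(self):
        # Enforce the six-criteria cap described in the workflow.
        if len(self.style_criteria) > MAX_STYLE_CRITERIA:
            raise ValueError(
                f"A publication may define at most {MAX_STYLE_CRITERIA} style criteria"
            )

# The Public Philosophy Journal's four values would serve as its style criteria:
ppj = Publication(
    name="Public Philosophy Journal",
    description="Collaborative community peer review",
    style_criteria=[
        "Thick Collegiality",
        "Ethical Imagination",
        "Diversity, Equity, and Inclusion",
        "Trust and Transparency",
    ],
)
```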

Now, with a functioning instance and publication, let’s walk through the process of submitting an item for review. After logging in, a person can click on submissions in the main menu. On that page, they will see, minimally, an opportunity to submit a new item for review. There is a drop-down menu that enables them to select which publication they want to submit to. If they’ve already submitted an item, those will be listed on this page as well. At the bottom, all the latest comments that were associated with any of their submissions are shown with the most current ones being displayed first. If they happen to be an administrator, they will also see all the submissions that currently exist in the application.

When creating a new submission, the person will need to enter a submission title and agree to the guidelines and review process for the publication. At this point, the system is tracking a draft of the submission and the person will be prompted to add content. The content can be uploaded from the person’s computer or manually entered by copy/paste or typing in text into a basic text entry box. After the content has been entered, the person can preview the submission, continue to edit it, or when ready, submit it for review with or without a comment to the editor concerning the submission.

After a submission is received, someone in an editorial or administrative role needs to review it and, if it is accepted, change the status to open the review. At the moment this is a two-step process, but there are plans to reduce it to a single step. The person reviewing the submission also has the option to request resubmission or to reject the submission. Once a submission has a status of “Under Review,” any role associated with that submission can begin performing the review.
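The submission lifecycle just described can be modeled as a small state machine. In this sketch, only “Under Review” is a status actually named in the article; the other status labels (and the collapsing of accept-then-open into one transition) are my own simplifications.

```python
# Allowed transitions between submission statuses, distilled from the workflow.
TRANSITIONS = {
    "Draft": {"Submitted"},                   # author submits the draft
    "Submitted": {"Under Review",             # editor accepts and opens the review
                  "Resubmission Requested",   # editor asks for changes
                  "Rejected"},                # editor rejects outright
    "Under Review": {"Accepted", "Resubmission Requested", "Rejected"},
    "Resubmission Requested": {"Submitted"},  # author resubmits
    "Accepted": set(),                        # terminal states
    "Rejected": set(),
}

def advance(status: str, new_status: str) -> str:
    """Validate and perform a status transition, rejecting illegal jumps."""
    if new_status not in TRANSITIONS[status]:
        raise ValueError(f"Cannot move from {status!r} to {new_status!r}")
    return new_status
```

Making the transition table explicit is one way to guarantee that, for example, review comments can only be added once a submission has reached “Under Review.”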

The review page offers a few features, such as light and dark modes, different fonts, and text sizing. Highlights and the inline comment tray can each be displayed or hidden with a simple click on the icons at the top. As a person reads the submission, they can highlight text, and a callout box with a “+” in it will appear. By clicking on the plus sign, they can add a new comment on that text and, optionally, associate the comment with any of the values/style criteria for that publication. Other reviewers or the submitter can reply to comments; we hope these conversations lead to an improved document. Finally, this page also hosts the Overall Comment section, where reviewers can leave their holistic thoughts about the piece.

You may be interested to know that I used Pilcrow to review this very article. I invited my teammates and colleagues to review the article before I submitted it to Michelle Flinchbaugh, the Biz of Digital column editor. The system is currently in beta and we’re continuing to work on adding features and fixing glitches. If you are interested in playing around in Pilcrow, we have set up a demo site at If you have ideas for improving Pilcrow, please comment at https://feedback.


ATG Special Report — Forty Years at the Big Library

I first stepped inside the Thomas Jefferson Building of the Library of Congress in April 1983. I was fresh out of library school and nervous, having flown in for two days of Library of Congress Intern Program selection interviews that were to commence the next day.

The magnificence of the Great Hall somehow heightened my anxiety. Then, I found myself standing in front of a large wooden display case that held the Library’s copy of the Gutenberg Bible. As I gazed at it and thought about the history of the book course that I had recently completed, I was overcome. My knees started shaking, and I had to find a place to sit and regain my composure. I pulled it together for the interviews and earned an invitation to participate as a paid intern in the upcoming twenty-week program of rotational work assignments and orientations. There was no guarantee of a permanent position at the end of the internship. Nevertheless, my wife Barbara and I decided to take the risk, leaving our jobs in Florida and moving up north. I am glad that we did. Those twenty weeks have now morphed into over two thousand weeks. And, believe me, I have seen and experienced a lot.

After the program, I was fortunate to be appointed to a reference librarian slot in the Serial and Government Publications Division. My cubicle featured an IBM Selectric typewriter and a grey metal desk that had the look and feel of World War II era government surplus. There was one computer in the division — a Compucorp terminal that was used by the Chief’s secretary to produce letters and other formal documents. Everyone else was left to use typewriters to create documents and to employ correction fluid to fix errors.

Within a couple of years, the division had received its first personal computer, and it was in such demand that a memorandum was sent to staff with the subject line, Guidelines on the Use of PC Computers. It stated that the one PC should be used “… for tasks that will make the most efficient use of staff time and machine time and effect the production economics for which the machine was designed and which justify operational costs.” It went on to direct that the PC could be used to create bibliographies and finding aids, but it was not to be used for routine correspondence. Of course, it wasn’t all that much longer until PCs supplanted typewriters across the Library.

As a Newspaper and Current Periodicals Reading Room reference librarian in the 1980s, I typically assisted readers to find article citations by suggesting standard printed sources for them to consult. The emergence of online resources such as DIALOG flipped the searching responsibility to the librarian. Patrons in the reading room did not have direct access to the online resources. Instead, the librarian had to serve as an intermediary, signing on in a back room via dial-up (using an acoustic coupler and hearing that electronic screech each time a connection was made), doing the carefully designed search (so as not to waste costly online time) and printing out a list of citations on thermal paper. It would be a while until reading room users would have direct access to online databases, although CD-ROM indexes were increasingly available to them in some disciplines.

Access to online information took a giant step forward with the 1993 launch of LC MARVEL (Library of Congress Machine-Assisted Realization of the Virtual Electronic Library), using gopher software from the University of Minnesota. It provided text-based information about the Library and its services in addition to offering “easy access” to textual resources available over the Internet. Library staff were provided with these instructions:

To access it, telnet to and login as marvel This will connect you to a “generic” Gopher client. Initially only 10 ports will be available to outside users for direct telnet connection.

Although gophers had their moment, they were soon replaced by browsers such as Mosaic that truly opened the gates to the World Wide Web.

While technology was moving forward, the Library faced recurrent funding issues, hitting a low point with budget reductions resulting from the Balanced Budget and Emergency Deficit Control Act of 1985, informally known as the Gramm-Rudman-Hollings Act. Its purpose was to reduce the Federal budget deficit and eventually produce a balanced budget. When fiscal belt tightening takes place, it is just a matter of time until individuals are impacted. That’s where I entered the picture. In early 1986, reductions hit the Library, and Sunday reading room service was eliminated, as were some of the evening hours of opening. At the same time, a “reduction in force” was put into place. My division chief handed me a letter stating that my position was being abolished and that I was eligible to be placed in another position in the institution. It would be at a lower grade, but I would have pay retention at the higher grade for a temporary period. So, off I went to the Congressional Research Service.

Meanwhile a group of external activists began a “Books Not Bombs” campaign to protest the Library’s reductions in service. Multiple protests took the form of reading room sit-ins that led to a number of arrests. Eight of the individuals were eventually found guilty in DC Superior Court of unlawful entry.

Funding was restored to the Library a few months later, and I was offered back my previous position at my higher grade. I gladly accepted.

By the 1990s, I was the Public Service Officer in the Collections Management Division (CMD), which placed me in the middle of a collections security crisis. For at least a decade, concerns about thefts and mutilations of collection items had grown. In 1988, a researcher was sentenced to three years in prison for transporting documents stolen from the Library and from the National Archives. Meanwhile, evidence accumulated of thefts and mutilations of books (particularly art titles) in the General Collections stacks in the Jefferson and Adams Buildings, which traditionally had been open to all staff and to researchers who received a pass from a reference librarian. In 1991, a radiologist was charged with (and eventually pled guilty to) stealing $40,000 worth of books from the Library and the University of Maryland. Later that same year, an attorney from the General Accounting Office was arrested with ten manuscript documents worth $33,000 in his pocket, although the full extent of his thefts was much greater than those ten items. In 1992, a book dealer from Alexandria, Virginia, was arrested after a Library investigator observed him ripping pages out of a book; he was found to have two maps under his sweater.

Decisive actions were taken to protect the collections. The General Collections stacks were closed to all members of the public and then to Library staff except for those who needed access to perform their work duties. At risk materials, such as folio art books, were caged. Theft detection gates were installed at the exits of all three Capitol Hill buildings, and theft detection targets were installed in hundreds of thousands of volumes in the existing collections. Surveillance video cameras were installed in reading rooms and collections storage areas. Researchers could no longer carry collections materials between reading rooms. These and many other measures served their purpose, but they were not welcomed by certain researchers and staff who had their access to materials restricted.

On the pleasant side during the ’90s, the Library’s Little Scholars Child Development Center opened in September 1993. Our 15-month-old daughter Katy was enrolled in the inaugural class, and we commuted together daily from Maryland while she was a Little Scholar. (Katy now has a library and information science degree from Drexel University and works at the Washington Research Library Consortium.)

Without a doubt, the most upsetting day I experienced at the Library was Tuesday, September 11, 2001. I had just returned with a cup of coffee to my desk in the CMD office in the Jefferson Building when the first reports came in. One of the stacks surveillance monitors in the office was switched to a live television channel, and we watched the developments with increasing worry. When word came that the Pentagon had been hit, I went up to the second floor of the Great Hall and looked out the windows facing west past the Capitol and across the Potomac River to Virginia. The smoke was very visible, and it was frightening.

An evacuation of the Library began shortly after 10:00, hampered by technical issues with public address communications. When I did leave that morning, the area near the Library was somewhat chaotic — there were people everywhere, hustling in various directions. Senator Patrick Leahy, followed by a number of individuals I assumed were his staff, moved quickly past me. I had heard rumors that there were bombs in the Metro system, so I didn’t want to travel home that way. Luckily, I caught a ride out of DC with a colleague. Of course, that day led to all types of security enhancements that are routine parts of our lives today.

On October 1, 2017, I began a 120-day detail as acting Associate Librarian for Library Services while my boss was detailed to serve as acting Deputy Librarian. In all honesty, I did not want to do the detail. I had been quite happy and felt fulfilled in the preceding few years serving as the Collection Development Officer. I knew that serving as a mega-administrator would not be very satisfying. But I was asked, and I said yes. Much as my twenty weeks at LC turned into two thousand weeks, my 120 days on detail grew into a year and a half. It was a relief to me when in April 2019 I was free to go back to doing collection development work.

Friday March 13, 2020, is another date to remember. We had been hearing the dire warnings about this new COVID-19 disease and how it was about to sweep the country. An all staff message that day instructed, “Starting on Monday, March 16, employees who have work assignments that are appropriate for telework and who are capable of teleworking will telework for as many of their daily work hours as possible, even if they do not regularly telework.”

So, I packed up some papers and other materials that would be needed immediately and planned to work from home for a couple of weeks. Little did I know that it would be two years until the Collection Development Office staff and I came back onsite on a regular basis. During the full-time teleworking, we somehow became closer and were able to be productive owing to the nature of our work. Nevertheless, I longed to see people, especially Library of Congress people, in person again.

When we finally did return in 2022, I found on my desk my Page-a-Day Cats calendar, still displaying Friday, March 13, 2020, with a photo of a black cat and that day’s inscription, “Though many Americans see black cats as unlucky, some parts of the world consider them good luck.”

One morning recently, before the Jefferson Building doors opened to visitors, I went to the Great Hall to once again gaze at the space that overwhelmed me in April 1983. It is still magnificent! And I stood in front of the Gutenberg Bible, now housed in a sleek new case. It was just the two of us, with me a bit more experienced than upon our first meeting. This time, it was more like seeing an old friend. My knees remained steady.


ATG Interviews Peter Potter

Director of De Gruyter eBound

Our editorial team had the chance to catch up with Peter about the recently announced launch of the University Press Library Open initiative.

ATG: Can you tell us about University Press Library Open (UPLOpen)?

PP: Sure. If I had to distill it down into a few words, I’d say that UPLOpen is an ambitious effort to accelerate open access eBook publishing among university presses. The most visible piece of UPLOpen is our website, UPLOpen.com, an eBook platform hosted by Ubiquity that offers a curated collection of open access scholarship from university presses with a central focus on achieving the United Nations Sustainable Development Goals (SDGs). When you dig a bit more deeply, though, what I find most exciting about UPLOpen is its potential to serve as a model for sustainable open access monograph publishing. Here’s what I mean by that.

UPLOpen also depends upon collaboration with other key players in the OA landscape. For example, Ubiquity has been working for years now with OPERAS on developing the OPERAS Metrics platform, which collects usage and impact metrics from many different sources and allows for their access, display, and analysis from a single access point. This technology, which we are integrating into UPLOpen, will enhance our ability to provide publishers with consistent book-level metrics.

Similarly, we view collaboration with OAPEN as crucial to the long-term success of UPLOpen. OAPEN, of course, plays an essential role in every aspect of the open access book ecosystem, from dissemination and metrics to quality assurance and digital preservation. This is why Brian Hole and I were eager to meet with Niels Stern and Laura Wilkinson at the London Book Fair in March to discuss ways that UPLOpen can support the work of OAPEN in the future.

UPLOpen builds on the foundation of De Gruyter’s University Press Library (UPL), a digital library containing eBook collections from dozens of the world’s leading university presses and publishing houses. Libraries that participate in UPL purchase these collections under favorable licensing terms and through a streamlined acquisition process. UPLOpen introduces an OA component to the UPL model in a way that takes full advantage of this streamlined process. Until now, OA book programs have largely been “something separate” and different from the core of library collection development. One of the goals of the UPLOpen model is to simplify library support mechanisms for OA publishing by unifying the acquisition of open content with paywalled content. The potential for scalability becomes clear when you consider that UPL has grown to comprise 50+ publishers and over 150 academic libraries.

ATG: How does UPLOpen fit within the open access landscape? How does it differ from other open access book programs, and do you see it as a competitor to initiatives such as OAPEN?

PP: There are, of course, plenty of places on the web where you can go to find open access eBooks, but many of the existing sites include a mixture of open access and paywalled content. All of the books you will find on UPLOpen.com are open access. Moreover, the platform itself — built by Ubiquity — is designed to optimize the benefits of open access. One of the things we are most proud of with UPLOpen is the fact that it is designed to fit into the growing open research infrastructure. Not only is the underlying technology open source, it also incorporates other important components of the open infrastructure, including ORCID, Crossref, DataCite, and others, all of which enhance functionality and user experience. This is very much in keeping with Ubiquity’s core commitment to open access and open scholarship in general.

The final point I’ll make about UPLOpen’s place in the open access landscape is that I believe we’ve created a great experience for users. Our goal from the beginning was to create a welcoming place where people could go to find freely available, authoritative research from many of the world’s top university presses without the distractions they find elsewhere on the web. We want visitors to hang out and explore the site as they would in a bookstore or a library. When they find a book they’re interested in, they can read it directly on the platform or download it in multiple formats. And if they want to own a print copy, we link directly to the publisher’s website or any commerce site the publisher chooses. The focus of UPLOpen is on the presses and, whenever possible, we want to drive traffic back to the presses in the hope that people will explore their sites and buy books that they would otherwise buy from Amazon. In the long run, I’m hoping that UPLOpen, with its growing amount of content, will become a proving ground for the proposition that OA can drive sales.

ATG: What is De Gruyter’s relationship to UPLOpen, and how is it related to the University Press Library (UPL)?

PP: As I mentioned above, UPLOpen grew out of De Gruyter’s University Press Library (UPL) program, which is a collection-based model for libraries to acquire eBooks from university presses. As the OA counterpart to UPL, UPLOpen provides a single destination for open access books published by UPL-affiliated presses. When a library purchases a UPL collection, a percentage of the total amount is allocated for open access publication, and those funds go not to De Gruyter but to De Gruyter’s not-for-profit foundation eBound, where they are then made available to UPL member presses to publish open access books.

In short, eBound is the financial hub of UPLOpen. As a 501(c)(3) public foundation, it oversees UPLOpen’s funding and coordinates UPLOpen’s multiple stakeholders. In addition to funds collected from the sale of UPL collections, eBound will generate additional revenue from other sources (including institutional memberships, one-time donations, and more) that collectively will give participating presses an ongoing, reliable funding stream to grow their OA book programs in years to come.

As the new executive director of eBound, my task is to further develop the UPLOpen business model in consultation with the eBound advisory board, which consists of leading figures in the academic library and scholarly publishing community. Throughout this process and beyond, I’m eager to be as transparent as possible so that we build trust in the community in keeping with our transparency pledge on the website.

ATG: According to the press release, “At launch, UPLOpen.com proudly hosts more than 350 open access books from over thirty university presses…” That’s quite impressive! Can you tell us about this development and how the different university presses became involved?

PP: The majority of the 375 books currently on UPLOpen come from two existing collections of open access books: Luminos and TOME. We started with these two collections for a few reasons. First, the UPLOpen platform has been, until recently, in beta testing, so it made sense to begin with a couple of discrete collections as a way to test the technology.

Luminos was a natural for the site because it was one of the first truly innovative OA book initiatives and it is owned by the University of California Press, which is a UPL press and a Ubiquity partner press. We’ve also benefited greatly from the feedback that Erich van Rijn of UC Press has generously offered as we’ve been developing UPLOpen.

The reasons for including TOME were a bit different. Before moving to De Gruyter in 2023, I had direct experience working with TOME and its sponsoring organizations (AAU, ARL, AUPresses), so I was deeply familiar with the initiative. And although the pilot ended in 2022, dozens of funded books were still to be published, with no guarantee that they would all find their way into a single collection where they could be discoverable as TOME books. So, we made a commitment to include the complete TOME collection in UPLOpen, many of whose titles are published by presses that are not part of UPL, which explains the presence on the site of books by non-UPL presses. The total number of TOME titles currently in UPLOpen stands at 185. We’ll add the remaining titles (probably 20-30 more) as they are published, until the collection is complete. In addition, we will make sure that all of the titles are added to OAPEN as well.

ATG: One of the unique aspects of this initiative is the focus on incorporating the UN’s Sustainable Development Goals. How are the SDGs incorporated into the book catalog? How are the badges assigned to show SDG alignment?

PP: We’re excited about the integration of the UN Sustainable Development Goals into the platform because of the invaluable role that academic publishers can play in facilitating implementation of the SDG objectives through both institutional practices and the works they publish. The way we’ve featured this on the website is to assign badges to books based upon the SDGs that they address. For instance, on the website you’ll find a book entitled Advancing Equality published by the University of California Press, and you’ll see that we’ve assigned 4 badges to the book corresponding with 4 specific SDGs. We used AI to identify the relevant SDGs, and the Ubiquity team developed the badge feature to highlight them. I should add that we will continue to refine our use of AI over time, iteratively, based upon feedback we receive from authors, readers, and the research community.

I also want to point out that the newest book added to UPLOpen is Redefining Development, written by Paula Caballero and Patti Londoño, two key players in the development and adoption of the SDGs. The book, published by Lynne Rienner Publishers in 2022, was just flipped to OA with funding from eBound. It is a natural for UPLOpen given the authors’ intention in writing the book: to incentivize innovation and boldness in how the SDGs are implemented.

ATG: Also from the press release, “By mid-2024, the number of titles hosted on UPLOpen.com is expected to exceed 2,500, with further plans for significant growth already in motion for 2025 and beyond.” Can you talk to us about your roadmap for the future? What are your next steps and plans for expansion?

PP: We have an ambitious development plan for UPLOpen, which we’ve sketched out on the website. I will just say here that our goal is to continue to improve the platform, adding both content and functionality over time. We will do this in consultation with stakeholders from across the scholarly publishing community.

With respect to content, our next step is to add more books to the site, including over 2,000 OA titles currently on degruyter.com. Next year, as we generate revenue for UPL presses, we will begin adding their born-OA books to the site. Meanwhile, over the next several years, we will be looking for opportunities to add more OA books from presses outside of UPL. Our long-range goal is to have 10,000 OA books on UPLOpen by 2025.

With respect to functionality, one particular area of priority for us is to improve search capability, which is rather basic at the moment. We’ll be adding an advanced search function that will enable users to search based on subject matter, genre, author, publisher, etc. Over time, we have plans to incorporate AI-powered semantic search algorithms, which will improve the ability of the search function to “understand” context and meaning behind user queries. We also look forward to experimenting with AI to verify and improve alignment of eBook content with UN SDGs and to enhance accessibility features for users with disabilities. Other items on the roadmap include the addition of MARC and KBART records for download by title and by batch, title lists for download, further development of the OPERAS Metrics platform to improve book-level usage metrics, and the integration of a news/social media feed.

ATG: Peter, thank you very much for talking with us and our ATG readers! We’ll be sure to keep an eye on UPLOpen.com for updates.

Peter Potter is Vice President at Paradigm Publishing Services and Executive Director of De Gruyter eBound. His previous roles included senior publishing positions at Virginia Tech, Cornell, Penn State, and Wesleyan. During his time at Virginia Tech, Peter served as a Visiting Program Officer for TOME (Toward an Open Access Monograph Ecosystem), a 5-year pilot project of the Association of American Universities (AAU), the Association of Research Libraries (ARL), and the Association of University Presses (AUPresses).



Shawn Averkamp

253 36th St., Ste. C309 Brooklyn, NY 11232 <>

FAVORITE BOOKS: Too many to choose, but recent favorites are Bliss Montage by Ling Ma and All Fours by Miranda July

HOW/WHERE DO I SEE THE INDUSTRY IN FIVE YEARS: Libraries will be more important than ever in curating sources of truth and teaching information literacy, but the vehicles and venues for these services will be something we haven’t yet imagined.

Peter Brantley

Director of Online Strategy, Library, University of California Davis, 1 Shields Avenue, Davis, CA 95616 <>


IN MY SPARE TIME: Read, hike, cook.

FAVORITE BOOKS: Gravity’s Rainbow by Thomas Pynchon; JR by William Gaddis.

Brandon Butler

Executive Director ReCreate Coalition <>

BORN AND LIVED: Born in Georgia (the suburbs of Atlanta), and have lived in Athens, GA, Austin, TX, Boston, MA, Washington, DC, and now in beautiful Charlottesville, VA.

PROFESSIONAL CAREER AND ACTIVITIES: 15 years of law and policy advocacy focused on copyright and the law’s impact on libraries and archives, higher education, fair use, and the public interest. Highlights include stints at the Association of Research Libraries, the American University Washington College of Law, and the University of Virginia, before turning to private practice and representing ReCreate in 2024.

FAMILY: Married my high school sweetheart (for real) and have two kids, a boy and a girl, who love copyright and libraries.

IN MY SPARE TIME: I’m a runner, a guitarist, and a reluctant renovator with two marathons, one West Coast tour, and two 1950s homes under my belt.

FAVORITE BOOKS: Infinite Jest, Wolf Hall, Our Band Could Be Your Life

PET PEEVES: When people say “fair use is just a defense.”

PHILOSOPHY: Fair use fundamentalist.

MOST MEMORABLE CAREER ACHIEVEMENT: Filing a brief in the Supreme Court and getting to line up for oral arguments in the VIP line (for members of the Supreme Court Bar!).

GOAL I HOPE TO ACHIEVE FIVE YEARS FROM NOW: Seeing my daughter graduate high school and sending her off to college.

HOW/WHERE DO I SEE THE INDUSTRY IN FIVE YEARS: In partnership with universities, libraries, and others to curate and share the best of the world’s knowledge at no cost to the author or the reader.

Lorcan Dempsey

Professor of Practice and Distinguished Practitioner in Residence

The Information School University of Washington, Allen Library 199D, 4000 15th Ave NE, Seattle, WA 98195 <>



FAVORITE BOOKS: Persuasion, The Volunteers and The Heather Blazing

HOW/WHERE DO I SEE THE INDUSTRY IN FIVE YEARS: Libraries will continue to provide access to the means of creative production for citizens, researchers and learners around the world.

Vessela Ensberg

Associate Director, Data Architecture UC Davis Library 1 Shields Avenue Davis, CA 95616 <>

IN MY SPARE TIME: I read, bike, hike and travel.

FAVORITE BOOKS: The Master and Margarita by Bulgakov; Crime and Punishment by Dostoevsky; The Painted Veil by W. Somerset Maugham; The King Must Die by Mary Renault.

Gary Price



Editor, infoDOCKET and ARL Day in Review. 416 E. Franklin, Silver Spring, MD 20901. Phone: (301) 538-3370 <>

BORN AND LIVED: Chicago, IL; Southfield, MI; Vienna, VA.

EARLY LIFE: Graduate of New Trier H.S. in Winnetka, IL; undergrad from U of Kansas; MLIS from Wayne State University.

PROFESSIONAL CAREER AND ACTIVITIES: Awards from the Special Libraries Association. Alumnus of the Year, Wayne State Library and Information Science Program.

FAMILY: Husband and Father.

IN MY SPARE TIME: Aviation Enthusiast. Pop Music History.

FAVORITE BOOKS: 1984 and The Jungle

HOW/WHERE DO I SEE THE INDUSTRY IN FIVE YEARS: Still dealing with how we (info pros) add value. Still dealing with how to market ourselves and services. Still dealing with how to best use AI/GPT.


2024 Vendor Showcase

Tuesday, November 12

10:00 am - 5:30 pm

Charleston Gaillard Center

Exhibitor registration now open!


ACMI, Australia’s Museum of Screen Culture

Federation Square Melbourne, Australia

Phone: 0411 534 138 <>

BORN AND LIVED: Trieste, Italy.

EARLY LIFE: I earned a master’s degree in Electronic Engineering (Trieste University) while serving as founder and president of a cooperative specializing in public administration services.

PROFESSIONAL CAREER AND ACTIVITIES: I started my career in IT as a partner or employee of small and medium IT companies, working as a network and system administrator, until I reached the position of IT service delivery manager at ACMI. Later, under the guidance of the newly appointed CXO (and later CEO) Seb Chan, I became the leader of a small team of creative technologists in charge of developing innovative digital products to enhance the museum visitor experience.

FAMILY: Married to Dr. Joanne Mihelcic.

IN MY SPARE TIME: I study and practice Qigong and Tai Chi, spend time with friends hiking and visiting galleries and museums, and am passionate about soccer and basketball.

FAVORITE BOOKS: I love Sci-Fi and history books, but I’ve also read almost every book by the Beat Generation writers, Émile Zola, and Fyodor Dostoevsky.

HOW/WHERE DO I SEE THE INDUSTRY IN FIVE YEARS: We are at the beginning of another digital technology revolution, driven by the rapid progress of Artificial Intelligence and supported by greater computing capacity at affordable cost. One of the main consequences will be a much larger population with the will and the skills to build digital solutions, thanks to simpler ways of conversing with, instructing, and controlling machines.

University of Washington

Allen Library 199D, 4000 15th Ave NE Seattle, WA 98195

Phone: (443) 824-9725 <>

PROFESSIONAL CAREER AND ACTIVITIES: Lucy is an Assistant Professor at the University of Washington Information School, and holds Adjunct appointments in Computer Science & Engineering, Biomedical Informatics & Medical Education, and Human-Centered Design & Engineering. She is also a Visiting Research Scientist at the Allen Institute for AI (AI2) on the Semantic Scholar research team. Her research asks whether AI and natural language processing techniques can help make sense of scientific output and assist people with making better health-related decisions. Her work on COVID-19 text mining, supplement interaction detection, accessible reading, and gender trends in academic publishing has been featured in publications such as Geekwire, VentureBeat, Boing Boing, Axios, and the New York Times. She received her PhD in Biomedical and Health Informatics from the University of Washington.

IN MY SPARE TIME: I enjoy tramping around the mountains and forests of the Pacific Northwest, especially when there is snow on the ground. As many have famously said before me in some variation: “A bad day outside is better than the best day sitting at my desk” and this rings deeply true for me.


The Information School

University of Washington Box 352840

Mary Gates Hall, Ste. 370 Seattle, WA 98195-2840


HISTORY AND BRIEF DESCRIPTION OF YOUR COMPANY/ PUBLISHING PROGRAM: As a leading member of the iSchool movement, the UW is a model for other information schools around the globe. The UW iSchool’s approach to information instruction and scholarship builds on the traditional roles filled by information professionals and infuses this with a strong emphasis on the technologies through which information is increasingly delivered. By tackling key social and technical problems in the information field, the iSchool has become an important link between users of information and designers of information systems, connecting society with the information it needs.

253 36th St., Ste. C309 Brooklyn, NY 11232

KEY PRODUCTS AND SERVICES: Consulting for digital asset management.

CORE MARKETS/CLIENTELE: Higher Education, Cultural Heritage, Non-Profits, Government, Corporations.


HISTORY AND BRIEF DESCRIPTION OF YOUR COMPANY/ PUBLISHING PROGRAM: AVP was founded in 2006 with a focus on furthering the use of technology and digital asset management by the world’s most influential organizations to positively impact humankind. AVP offers platform-neutral consulting services to help organizations get the most from their digital asset management program. Our services consist of strategy, technology selection, technology implementation, program optimization, and research and development.

IS THERE ANYTHING ELSE THAT YOU THINK WOULD BE OF INTEREST TO OUR READERS? To find more valuable resources and articles to help you enhance your digital asset management program and use of AI, check out


and priorities. Giannis Tsakonas (University of Patras, Greece) joined virtually, with interesting statistical analysis to show this paradox.

The conference, however, looked beyond challenges to opportunities, beyond environment to projects and strategies. Of much interest is the movement towards open access publication in Africa, outlined in a presentation by Bastien Miraucourt (CNRS, France), who works in Africa to make open access monograph publishing a reality. He analyzed types of communities participating in and served by African publishing. He made it clear that there are really several Africas, divided by climate and politics. The string of countries down the east coast of Africa, from Ethiopia and Somalia all the way to South Africa, clearly constitutes one (multilingual) community of nations, while traditional West Africa (e.g., Nigeria, Ghana, Sierra Leone) forms another natural region. The central spine of sub-Saharan Africa is more culturally and politically challenged, while the Saharan and immediate sub-Saharan regions are divided from Africa farther south by climate and by religion. The traditional Maghrib on the Mediterranean coast to the north comprises another community of Islamic nations with histories of relatively close ties to their European colonizers.

Miraucourt then outlined the challenges for scholarly publishing in general in Africa, telling, for example, the sobering story of near-total cuts in publishing and library acquisitions at the huge University of Ibadan in Nigeria. He believes that it is nonetheless possible to envision building up again an infrastructure and practice of scholarly publishing in Africa, looking to a time when it will be possible to advance strategies of open access. Presentations by Susan Murray (AJOL - African Journals Online) and Rosalind Hattingh (Sabinet Online) introduced us to numerous African open access journals.

Ros Pyne (Bloomsbury Publishing) offered a different solution: the new Bloomsbury Open Collections, which now include African Studies in a (so far) limited number of monographs opened by a variant of subscribe-to-open. Their series has the advantage of having real books available now. However, the pilot project has yet to meet its first-year funding goal, so fewer open access volumes are coming available than Bloomsbury had hoped. The project is also subject to a concern heard in Africa itself: that Bloomsbury are publishing English-language volumes about Africa, written outside Africa. If the dream is to support the creation and dissemination of books from, of, and in Africa, there is still a long way to go.

A major South African cultural heritage challenge was described in a sobering presentation by Mpho Ngoepe (University of South Africa) with the engaging title, “It wasn’t raining when Noah built the ark.” His theme was the loss of precious historical records in Africa due to the absence of adequate archival priorities, policies, and facilities. He gave accounts of precious records being deliberately removed by departing colonial officials who didn’t want their misdeeds exposed, and of unique documents written by Nelson Mandela, or the autopsy report of Steve Biko, held by private individuals who then sold them at auction for large sums of money. In that context, there is a lot of basic work yet to be done. “Let’s build the ark,” Dr. Ngoepe concluded, in a metaphor for the entire Retreat.

Attendees could see in the meeting many prospects for a brighter future. We heard a report from attendees at a pre-conference workshop, run in combination with the Retreat, in which young academic library professionals from around Africa joined in a day-long coaching and professional development session. They were an impressive group, fully worthy of one day succeeding the elders of the community, who were also with us at the Retreat. I left the Retreat knowing I had been blessed with a snapshot of a world region where, although many things need fixing, so much is now in motion to build a bright future, hand in hand with the global north.


Back Talk — It Wasn’t Raining when Noah Built the Ark

The longest overwater commercial flight in the world, the UAL purser called it, the 15 hours from Newark, NJ to Cape Town, South Africa. The flight was a perfect transition to a memorable Fiesole Retreat that took us Global Northerners well out of the worlds we are familiar with.

What changes when you make that long journey? The sun runs over the northern, not the southern, sky and the stars are fresh and bright and unfamiliar. The trees at the retreat hotel on the east side of Table Mountain are lush and not quite familiar. East African tortoises roam the grounds, and at night two Egyptian geese swim in the outdoor pool and tuck into nearby shrubbery for their overnight accommodations.

The social landscape is also subtly distinct from the American. South Africa has many serious political challenges, with the indigenous population still economically disadvantaged. (I learned an unfortunate new word, “loadshedding,” on the first morning in Cape Town when my wakeup call never came; it seems that even Cape Town suffers regular, deliberate, intermittent power cuts to keep the whole system functioning.) There is a subtly different public social mixing of races from what is familiar in the U.S. The difference is hard to describe, but one sees the outlines of a more genuinely and comfortably multiracial society than is often on display in the U.S. We all have a long way to go, but there are different paths.


This year’s Fiesole Retreat (the 24th) was held in Cape Town, its first time on the African continent, a location intentionally chosen to take advantage of those different points of view for a two-day program devoted to the challenges of connecting global north and south in support of advancing the future of scholarly communication and scholarship. Michele Casalini and his team have built a community of practice, comprising loyal long-time members who are joined each year by an enriching regional group attracted by location and theme. Below are just a few highlights; a full set of slides is being posted to the Fiesole Retreat website.

This year, my role in the community was to pull together the opening session with a series of talks that set the context. Ujala Satgoor (University of Cape Town) provided an overview of the challenges for including Africa in global scholarly communications systems, followed by Biliamin Popoola (University of Medical Sciences in Ondo City, Nigeria) bringing a researcher’s perspective about publishing for a global readership (or not). The remarkable Nokuthula (“call me Nox!”) Mnchunu (South African Research Foundation’s Deputy Director of the African Open Science Platform - AOSP) outlined the impressive strategy and infrastructure now being developed to advance South African official scientific and scholarly communications.

Acquinatta Zimu-Biyela (Professor at the University of South Africa) reflected on the theme of decolonization as lived through the South African experience, providing the larger historical and social context in which African libraries’ work has to go forward.


Toni Nix, Advertising Manager, Against the Grain, Charleston Hub <> • Phone: 843-835-8604


From the initial context-setting, the conference branched into various directions. One large theme was discussion of challenges of less-advantaged nations and particularly of Africa’s hundreds of diverse languages. South Africa is, of course, a very multilingual society. In African nations, concern for the integrity, survival, and contribution of national and regional cultures is strong, and yet English has become the effective dominant language, bringing with it a flood of mass culture from around the world, threatening to swamp local perspectives continued on page 69


Against the Grain has gone digital!


Contact Toni Nix at <> Click the links below for information on how to Subscribe, Submit Content, or Contact Us

About Against the Grain

Against the Grain (ISSN: 1043-2094) is your key to the latest news about libraries, publishers, book jobbers, and subscription agents. Our goal is to link publishers, vendors, and librarians by reporting on the issues, literature, and people that impact the world of books and journals. ATG eJournal will be published five times a year (February, April, June, September, and November) and will be distributed to ATG subscribers, Charleston Library Conference attendees, and registered members on the Charleston Hub.

Find ATG on the Charleston Hub at
