What do you need from a digital preservation tool registry?

I recently spent a few minutes collecting together all the lists, registries and obvious sources of digital preservation tools that I could find.

I quickly managed to compile quite a long list of lists, and I’m sure there are many more useful sources that I’ve missed. A quick scan through them suggests there is some overlap. Most included the usual suspects, such as DROID, JHOVE, FITS and the NZ Metadata Extractor. But pretty much every list had some unique entries that looked worthy of further exploration.

As resources for someone trying to solve a digital preservation challenge, these lists are not doing their job very well. It’s not surprising that we see a lot of reinventing of the wheel, as developers and projects tackle problems that they don’t realise someone has already solved, or at least has gone a long way to solving.

Surely we can do better?

The broader challenges here remind me a little of a comment from Andy Jackson on file format registries. He noted that the DP community had described the PDF format in at least 5 or 6 different locations, but the best source of information about PDF is still Wikipedia. By spreading our effort around, and ignoring resources that already exist, we waste the opportunity to make a real impact.

Trevor Owens recently re-triggered a Stack Exchange proposal for a DP question and answer site, and I think this has a lot of potential for our community. A single point of entry to the morass of DP-related information spread thinly across the web would be incredibly useful. And it may also be a way to kick-start a bit more international cooperation and coordination. I’ll blog a bit more about this another time…

Focusing our community’s effort in a more coordinated manner would clearly be beneficial to us all. But is this a realistic aim? There are a variety of challenges to overcome. We need central locations that users from different institutions and organisations are willing to visit and contribute to. Or nifty solutions to link, share and combine scattered resources so they can be exploited as a whole. We need the right functionality to meet the needs of the users. And we need sufficient critical mass to make the resource(s) a success. Hopefully a self-supporting success that is moderated, updated and grown via the efforts of its users.

Bringing it back to the specifics of the Tool Registry challenge: I proceeded by having a detailed look through the tool lists I’d collated. Many were pretty basic, and some were very out of date (or completely abandoned). Some were quick lists of URLs with no more annotation than a few simple notes. Some had a lot more detail and were classified with tags.

Coincidentally, colleagues at the DCC chose the same week to update their catalogue of tools and services, which is well worth a look. Clearly they’ve put a lot of work into it, as the result is rather polished and has some richly annotated content. The categorisation by user and lifecycle stage seems very useful, but there aren’t many of these handy groupings. And although they’ve asked for feedback and contributions, a user can’t directly edit or contribute to it.

Taking a very different approach, the OPF registry (towards which I’m a little biased, having worked with it before) is wiki based and has had some success at engaging the community in building the resource. A particular aim is to link information about tools with the (positive and negative) experiences of users in trying them out. This evaluative information seems to be critical. But despite this, it’s certainly a long way from reaching a critical mass of users.

Regardless of the different approaches, it appears that all these lists and registries are broadly trying to achieve the same thing: support users in finding the best tools to solve their preservation challenges. If it were possible to combine them and focus our collective effort in one place, the end result would be much more useful.

What are the specific requirements to meet this aim? These are the ones that spring to my mind:

  • Provide easily browsable tool lists with one-line summaries
  • Tools should be organised or categorised in a useful manner. Filtering by kind of user, purpose or subject is essential to make browsing focused. Tagging seems to be a good way to do this
  • Should be user-editable to reduce the maintenance load and enable the community to be directly engaged in enhancing it
  • Should be possible for users to add experiences of applying a tool (e.g. this tool works well in situation x, but not situation y)
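
To make those four requirements a little more concrete, here is a rough sketch of the kind of data model they imply. It is only an illustration, not a description of how any existing registry works: the class names (Tool, Experience, Registry), the method names and the example tags are all my own assumptions, and a real registry would obviously need persistence, user accounts and moderation on top.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Experience:
    """A user's note on applying a tool (hypothetical structure)."""
    author: str
    note: str


@dataclass
class Tool:
    """One registry entry: a name, a one-line summary, tags and user notes."""
    name: str
    summary: str
    tags: set[str] = field(default_factory=set)
    experiences: list[Experience] = field(default_factory=list)


class Registry:
    """In-memory registry sketch; every name here is an assumption."""

    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def add_tool(self, name: str, summary: str, tags=()) -> Tool:
        # User-editable: adding an existing tool updates it instead of
        # creating a duplicate entry (names are matched case-insensitively).
        key = name.lower()
        if key in self._tools:
            tool = self._tools[key]
            tool.summary = summary
            tool.tags.update(tags)
        else:
            tool = Tool(name, summary, set(tags))
            self._tools[key] = tool
        return tool

    def add_experience(self, name: str, author: str, note: str) -> None:
        # Attach evaluative information to an existing tool entry.
        self._tools[name.lower()].experiences.append(Experience(author, note))

    def get(self, name: str) -> Tool:
        return self._tools[name.lower()]

    def browse(self, *tags: str) -> list[str]:
        # Browsable one-line summaries, filtered to tools carrying ALL the
        # given tags and sorted by name. No tags means list everything.
        wanted = set(tags)
        matches = [t for t in self._tools.values() if wanted <= t.tags]
        matches.sort(key=lambda t: t.name.lower())
        return [f"{t.name}: {t.summary}" for t in matches]


if __name__ == "__main__":
    registry = Registry()
    # Summaries and tags below are illustrative placeholders only.
    registry.add_tool("DROID", "File format identification", ["identification"])
    registry.add_tool("JHOVE", "Format identification and validation",
                      ["identification", "validation"])
    registry.add_experience("JHOVE", "example-user",
                            "Works well in situation x, but not situation y")
    for line in registry.browse("identification"):
        print(line)
```

Tags are kept as plain sets so that filtering by kind of user, purpose or subject is just a subset check, and experiences hang directly off each tool so the evaluative notes travel with the entry.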

Am I missing anything important? Can we find a way to pool our resources more effectively? And most importantly: What do you need from a digital preservation tool registry?

