People & knowledge are the keys to breaking down the walls between daily operations and digital preservation (DP) within our organisations. DP is not a...
Blogs
Recent years have seen an ever-increasing interest in developing Data Mining methods that allow us to find structured information of interest in very large collections...
These have been two busy days in Den Haag, where Carl Wilson from the OPF has been showing us how to use tools in order...
In my last blog post about ARC to WARC migration I did a performance comparison of two alternative approaches for migrating very large sets of...
For quite some time at The National Archives (UK) we've been working on a tool for validating CSV files against a user-defined schema. We're now...
This post covers two related topics: characterising web content with Nanite, and my methods for successfully integrating the Tika parsers with Nanite....
