User story: archiving a FoxPro database
As a first encounter with database archiving, FoxPro is a nice challenge. This is the story of our first database archiving attempt.
The archive institution received a PC as part of a journalist’s archive and amongst other things there was a custom-made database application installed on it. While the PC was still working, it was pretty old and looked like it could die any moment. A Ghost image was created A.S.A.P. but the image was created with a version of Norton Ghost that is not recent enough to convert it in a virtual machine, but we can explore the image. The custom application is created with FoxPro, but the application’s installation program is no longer available. Any attempt we made to try to get it to run on another (virtual) machine failed. We were clearly missing some libraries in obscure places or some registry entries. Given that it is so difficult to get the application to run it was quickly clear that we needed to convert the database into another format for archiving and presentation purposes.
Our first attempt was to get the database loaded into Access, but that was only a partial success. The table relations were nicely recovered and most of the data got transfered nicely into Access but we had issues with character set and datatype conversions. The process requires the installation of a FoxPro ODBC driver which may be a challenge on its own when working in a 64-bit Windows environment. If you are a happy owner of a recent machine with a Windows 64-bit OS and want to convert older databases we strongly suggest you get a virtual machine running 32-bit Windows XP. It will make your conversion life much easier. Windows 7 Professional, Enterprise and Ultimate users can download a XP virtual machine for free at http://www.microsoft.com/windows/virtual-pc/download.aspx. If you decide to live dangerously and go with your 64-bit Windows, here’s a tip: the ODBC driver is 32-bit and is not visible in the ODBC Administration tool. Start C:\Windows\SysWOW64\odbcad32.exe instead. Good luck!
In the mean time we felt disappointed with the FoxPro to Access conversion. It’s hard to setup on a recent PC and the results we got are not satisfactory. The database contained some memo fields. Some of them are pretty large and contained diacritical characters. During inspection of some of the converted data we suspected parts of it were lost. But the main problem we faced was the number of steps we needed to take to get the job done because our goal is to automate the ingest procedures as much as possible. It is also not possible to get a less digital experienced user at the archiving institution to repeat our steps without a lot of help.
At the OPF Dabase Archiving Hackathon we learned that MS SQL server would make things much easier as it has support for importing the data built-in. We were very happy that our case was used for one of the breakout sessions. Installing MS SQL server is not always that easy and we only managed to get the data into SQL Server by installing a ODBC/OLEDB driver for FoxPro. Expect to have similar problems on 64-bit Windows as with Access. After migrating most of the issues we had with the Access conversion also applied to the SQL Server scenario. Thanks to the collegues in the breakout session we did get a result very fast though, even if it was to conclude that converting to the MS SQL Server did not change much. Apparently the FoxPro memo data is the source of most of our problems.
So we took a different approach. FoxPro storage is basically very close to the dBASE and DB2 storage formats and there are many tools out there that to a good job in converting dBASE and/or DB2. We tried converting the database using MIXED and allthough it is supposed to support FoxPro, it did not work. The differences in the main dbf format is well documented and seemed to match with the files of our database perfectly. We looked at the memo data files and the matched the documented format as well. The format documentation is somewhat hidden in the MSDN knowledge base. Here is the link to the latest version: http://msdn.microsoft.com/en-US/library/d863bcf2(v=vs.80).aspx. Be sure to read the updates in the comments as there seem to be errors in the documentation. The DBF file format is best explained on http://www.dbf2002.com/dbf-file-format.html. For the other file formats, the MSDN documentation is your only source.
We will be looking into open source dBASE file readers/converters like the MIXED drivers and adapt/upgrade them to handle FoxPro databases like ours. The next step will be to convert the database into an XML format (SIARD – DBML – anything that is standardized). The XML package will need to be documented and we will be looking for an application that can add the metadata to the package and make a complete SIP package ready to be ingested. The last step will be resurrection of the database into a life database for consultation purposess, depending on the requirements of the archiving institution. Still a long way to go, but at least we made significant progress during the Hackathon.
We learned that converting old databases on a 64-bit Windows platform is a bad idea and using other DB formats as intermediate step is only feasible if your old format is well supported and does not contain any complex data or large text/binary blobs. For the more complex cases we suggest to look into drivers or tools that can read the data and convert to the archival format directly or adapt/create them to make them work.
By Kris Dekeyser, posted in Kris Dekeyser's Blog
There are no comments on this post.