Dissimilar: an experimental Image Quality Assurance tool
An important part of image file format migration is quality assurance. Various tools can be used, such as ImageMagick or Matchbox, but they either provide only one metric or are designed for different use cases. I wanted to understand how image comparison algorithms are implemented, so I began investigating.
I created a prototype tool/library for image quality analysis, called Dissimilar. I had previously prototyped a tool that used the OpenCV libraries in Java to perform image comparisons. Those experiments showed that, while possible, it was not ideal; a large native-code shared object needed to be packaged with the tool and some inline memory management was required.
For Dissimilar I subsequently implemented PSNR and SSIM algorithms from scratch in Java, making use of Apache-Commons Imaging and Math3 libraries. The result is about 600 lines of commented, pure-Java code for performing image quality analysis.
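As a rough illustration of the PSNR calculation (this is not the Dissimilar source), a minimal sketch might look like the following. It assumes 8-bit greyscale pixels that have already been decoded into int arrays of equal size; the class and method names are hypothetical, and image loading is left out entirely.

```java
// Hypothetical sketch: PSNR for two equally sized 8-bit greyscale images.
// Pixels are assumed to be already decoded into int[row][column] arrays.
class PsnrSketch {
    // Maximum possible pixel value for 8-bit samples (an assumption of this sketch).
    static final double MAX_VALUE = 255.0;

    /** Mean of the squared per-pixel differences. */
    static double meanSquaredError(int[][] original, int[][] migrated) {
        if (original.length == 0 || original.length != migrated.length
                || original[0].length != migrated[0].length) {
            throw new IllegalArgumentException("Images must be non-empty and the same size");
        }
        int height = original.length;
        int width = original[0].length;
        double sum = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                double diff = original[y][x] - migrated[y][x];
                sum += diff * diff;
            }
        }
        return sum / ((double) width * height);
    }

    /** PSNR in decibels; identical images give positive infinity. */
    static double psnr(int[][] original, int[][] migrated) {
        double mse = meanSquaredError(original, migrated);
        if (mse == 0) {
            return Double.POSITIVE_INFINITY;
        }
        return 10.0 * Math.log10(MAX_VALUE * MAX_VALUE / mse);
    }

    public static void main(String[] args) {
        int[][] original = {{10, 20}, {30, 40}};
        int[][] migrated = {{12, 20}, {30, 40}};
        System.out.println("PSNR (dB): " + psnr(original, migrated));
    }
}
```

Higher values mean the two images are closer; a lossless round trip would report infinity in this sketch.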
The SSIM is calculated for an image by splitting it into 8 pixel by 8 pixel “windows” and then calculating the mean of the results for each window. In addition to the (mean) SSIM value, Dissimilar reports the minimum SSIM value and the variance of the SSIM values. It may be useful to combine some of the mean, minimum and variance to set a better threshold for image format migration. For example, setting a minimum value would ensure that the quality of every 8×8 window stayed above a certain threshold. Using the variance would make it possible to identify images where the individual SSIM windows differ widely, but where those values still produce a mean that would be assessed as acceptable.
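To make the windowed calculation concrete, here is a minimal sketch of one way it could be done; it is not the Dissimilar code. It assumes non-overlapping 8×8 windows, 8-bit greyscale pixels held in int arrays, the commonly used SSIM constants (K1 = 0.01, K2 = 0.03), sample variance within each window, and that partial windows at the right and bottom edges are simply skipped. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: per-window SSIM with mean, minimum and variance.
class SsimSketch {
    static final int WINDOW = 8;
    // Stabilising constants for 8-bit data, using the commonly quoted K1 and K2.
    static final double C1 = Math.pow(0.01 * 255, 2);
    static final double C2 = Math.pow(0.03 * 255, 2);

    /** SSIM for the single window whose top-left corner is (x0, y0). */
    static double windowSsim(int[][] a, int[][] b, int x0, int y0) {
        int n = WINDOW * WINDOW;
        double sumA = 0, sumB = 0;
        for (int y = y0; y < y0 + WINDOW; y++) {
            for (int x = x0; x < x0 + WINDOW; x++) {
                sumA += a[y][x];
                sumB += b[y][x];
            }
        }
        double meanA = sumA / n, meanB = sumB / n;
        double varA = 0, varB = 0, cov = 0;
        for (int y = y0; y < y0 + WINDOW; y++) {
            for (int x = x0; x < x0 + WINDOW; x++) {
                double da = a[y][x] - meanA;
                double db = b[y][x] - meanB;
                varA += da * da;
                varB += db * db;
                cov += da * db;
            }
        }
        // Sample (n - 1) statistics; an assumption of this sketch.
        varA /= n - 1;
        varB /= n - 1;
        cov /= n - 1;
        return ((2 * meanA * meanB + C1) * (2 * cov + C2))
                / ((meanA * meanA + meanB * meanB + C1) * (varA + varB + C2));
    }

    /** Returns {mean, minimum, variance} of the per-window SSIM values. */
    static double[] summarise(int[][] a, int[][] b) {
        int height = a.length, width = a[0].length;
        List<Double> values = new ArrayList<>();
        // Step through whole windows only; edge remainders are ignored.
        for (int y = 0; y + WINDOW <= height; y += WINDOW) {
            for (int x = 0; x + WINDOW <= width; x += WINDOW) {
                values.add(windowSsim(a, b, x, y));
            }
        }
        if (values.isEmpty()) {
            throw new IllegalArgumentException("Image is smaller than one window");
        }
        double sum = 0, min = Double.POSITIVE_INFINITY;
        for (double v : values) {
            sum += v;
            min = Math.min(min, v);
        }
        double mean = sum / values.size();
        double squares = 0;
        for (double v : values) {
            squares += (v - mean) * (v - mean);
        }
        // Population variance across windows; again an assumption.
        return new double[] {mean, min, squares / values.size()};
    }

    public static void main(String[] args) {
        int[][] a = new int[16][16];
        int[][] b = new int[16][16];
        for (int y = 0; y < 16; y++) {
            for (int x = 0; x < 16; x++) {
                a[y][x] = (x * 7 + y * 13) % 256;
                b[y][x] = (x < 8 && y < 8) ? 0 : a[y][x]; // damage one window
            }
        }
        double[] r = summarise(a, b);
        System.out.println("mean=" + r[0] + " min=" + r[1] + " variance=" + r[2]);
    }
}
```

In the example in `main`, only one of the four windows is damaged, so the minimum falls well below the mean. That is the kind of case where a threshold on the minimum or the variance could catch a problem that the mean alone might hide.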
Testing was performed using our Hadoop cluster to enable comparison of results from ImageMagick (PSNR) and Dissimilar (PSNR/SSIM). A tiff was migrated to lossy jp2 and then back to tiff. The original tiff and second tiff were then compared using each tool, each tool therefore having identical inputs.
It is worth noting that there is no built-in support for JPEG2000 files in Apache-Commons Imaging, so it is worth using a known decoder to decompress to tiff before comparison. For more about that see our iPres paper in September.
Results on a homogeneous dataset of 1000 greyscale image files showed that ImageMagick took about half the execution time of Dissimilar. This is a good result as the code is currently unoptimised. The execution time of Dissimilar also includes startup of a new JRE, an SSIM calculation and saving an SSIM “heatmap” image to identify the low values, so some execution speed savings are expected. It is possible to call the code as a library; this could be done as part of a Java workflow, thus removing the overhead of starting a new JRE. Some information regarding the difference between using a Java library versus executing a new JRE has been blogged about before.
The PSNR results were identical to those from ImageMagick. The SSIM results were not the same as Matchbox’s, but I think Matchbox and Dissimilar calculate SSIM in different ways. I couldn’t find another readily available and tested tool that calculates SSIM to verify the results; suggestions are welcome!
Next steps include testing more files, producing more unit tests, optimisation and identifying suitable values for the threshold of SSIM-mean, SSIM-minimum and SSIM-variance. I am also going to investigate adding more types of image quality assessment metrics.
By willp-bl, posted in willp-bl's Blog