image-dataset-converter library release

Our wai-annotations library has been really useful over the years for converting annotated datasets from one format into another, whilst also offering the ability to apply inline stream processors to influence the data passing through. As with any framework that has grown organically, it is sometimes necessary to take a step back and rethink the validity of past approaches.

Since I don’t like having lots of little code pieces lying around, I like the approach of command-line-based pipelines, an approach that wai-annotations followed. However, adding new plugins was a bit awkward, especially at development time: plugins were registered via the entry_points section of each module's setup file, and that information is not available at development time…
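To illustrate the limitation, here is a minimal Python sketch of entry-point-based plugin discovery. It is not the actual wai-annotations code: the function name discover_plugins and the group name used below are hypothetical, and only importlib.metadata.entry_points is a real standard-library call.

```python
from importlib.metadata import entry_points


def discover_plugins(group):
    """Return {name: entry point} for every installed package declaring `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points object
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: plain dict mapping group name -> list of entry points
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


# Hypothetical group name, for illustration only.
plugins = discover_plugins("hypothetical.annotation.plugins")
print(sorted(plugins))
```

Entry points are read from the metadata of installed distributions, so a plugin that only exists as a source checkout does not show up in this lookup until the package has been installed (for example in editable mode).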

In order to make command-line pipelines easier, I have been developing the seppl library for a while now (it came out of our llm-dataset-converter project). Using seppl as the basis, I migrated the code related to image dataset conversion to a new library (and added new functionality in the process). The result is the image-dataset-converter library, which I finally had time to make a first release of:

Of course, there are resources available on how to use this new library as well: