How to speed up I/O-intensive tasks with multithreading and asyncio
Recently I had to run a batch processing task in which thousands of images were downloaded from S3, processed, and then uploaded to a new S3 bucket. Because the processing was relatively lightweight, most of the run time was spent downloading and uploading images, that is, on I/O. Such I/O-bound tasks are a great fit for multithreading (CPU-bound tasks are a better fit for multiprocessing, with all its quirks related to serialization). In this post, I'd like to share a small example of how to run tasks in a thread pool.
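To make the idea concrete before going into details, here is a minimal sketch using `concurrent.futures.ThreadPoolExecutor` from the standard library. It is not the actual S3 pipeline: `process_image` is a hypothetical stand-in that only sleeps to imitate network latency, and the key names, the 16-item batch, and `max_workers=8` are assumptions chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_image(key: str) -> str:
    """Hypothetical stand-in for download -> process -> upload of one image."""
    # Simulated network latency. Like real blocking I/O, sleeping releases
    # the GIL, so other threads can make progress in the meantime.
    time.sleep(0.05)
    return f"processed/{key}"


def run_in_thread_pool(keys: list[str], max_workers: int = 8) -> list[str]:
    """Run process_image for every key in a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of the inputs,
        # regardless of which task finishes first.
        return list(pool.map(process_image, keys))


# Assumed example input: 16 made-up object keys.
keys = [f"image_{i}.jpg" for i in range(16)]

start = time.perf_counter()
results = run_in_thread_pool(keys)
elapsed = time.perf_counter() - start

print(f"Processed {len(results)} images in {elapsed:.2f}s")
```

Run sequentially, these 16 simulated tasks would take at least 0.8 seconds (16 × 0.05). With 8 worker threads the waits overlap, so the batch should finish in roughly two rounds of 0.05 seconds plus some overhead. The right number of workers for a real workload depends on network bandwidth and service rate limits, so it is worth measuring rather than guessing.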