How to Use Apache Beam's MultiProcessShared (and Why You Need It)

Overview: If you’re running Apache Beam pipelines with GPUs or large ML models, you’ve probably hit “CUDA out of memory” errors. Here’s what’s happening: each worker process loads its own copy of your model, eating up memory until everything crashes. Apache Beam 2.49.0 added MultiProcessShared to fix this. It lets you share one copy of a resource (such as a GPU model) across all processes on a worker, instead of loading it separately in each process. This can drop your memory usage from 24GB to 3GB. ...

November 5, 2025 · 13 min

AI Platform with Python at Karrot - Inference Pipeline Part (PyCon Korea 2025)

Overview: This article covers our journey building a large-scale ML inference pipeline at Karrot (당근) using Python, Apache Beam, and Google Cloud Dataflow. The presentation was given at PyConKR 2025 by Park JunSeong and Byun Kyuhyun from the ML Infrastructure and ML Data Platform teams. Slides: AI Platform with Python (PDF). Table of Contents: Part 1: ML Infrastructure with Python (by Park JunSeong), Service Growth with AI Models ...

October 4, 2025 · 17 min