TorchServe: Production-Ready Inference for LLMs

My first project at Neeva was to replace our existing Python Flask app with TorchServe. TorchServe purports to be a production-ready framework for serving HuggingFace and PyTorch models. The tl;dr is, well, it is! In addition, you can transform non-PyTorch models into PyTorch for serve time 😉.

Read all the gory details here.
