Hugging Face has introduced its latest offering, Hugging Face Generative AI Services (HUGS), aims to simplify the deployment and scaling of generative AI applications using open source models.
HUGS is based on Hugging Face technologies such as Transformers and Text Generation Inference (TGI) and promises optimized performance for various hardware accelerators.
For developers using AWS or Google Cloud, the service is available for $1 per hour per container, with a five-day free trial on AWS to get users started.
Streamlining AI with zero-configuration inference
HUGS provides developers with a solution to run AI models on their own infrastructure without the need for manual configuration. One of the key challenges when implementing large language models (LLMs) is optimizing them for specific hardware environments. Every accelerator, whether it’s an NVIDIA GPU or an AMD GPU, requires fine tuning to achieve maximum performance.
With HUGS, these optimizations are managed automatically, delivering high throughput out-of-the-box. In addition to NVIDIA and AMD GPUs, the company promises that support will soon expand to AWS Inferentia and Google TPUs.
Hugging Face aims to ease the transition from black-box APIs to open, self-hosted solutions with support for a wide range of models, including well-known LLMs such as Llama and Gemma, with plans to soon introduce multi-modal models such as Idefics and Llava . . The company says it will integrate models like BGE and Jina in the future, giving developers even more options to customize their AI applications.
This service uses standardized APIs compatible with OpenAI’s model interfaces, allowing developers to migrate their own code.
Especially for startups, HUGS offers the opportunity to build AI applications without the high costs associated with proprietary platforms. The availability of one-click deployments on DigitalOcean makes it even easier for small teams to experiment with generative AI technologies.
Meanwhile, larger enterprises can use HUGS to scale their applications without being tied to a single cloud provider or proprietary API. On DigitalOcean, HUGS is included for free in addition to the standard cost of GPU Droplets. Hugging Face also offers custom deployment solutions for enterprises through the Enterprise Hub.