Amazon SageMaker inference launches faster auto scaling for generative AI models

By Admin 26/07/2024

AWS Machine Learning Blog: Today, we are excited to announce a new capability in Amazon SageMaker inference that can help you reduce the time it takes for your generative artificial intelligence (AI) models to scale automatically. You can now use sub-minute metrics and significantly reduce overall scaling latency for generative AI models. With this enhancement, […]

AI21 Labs Jamba-Instruct model is now available in Amazon Bedrock

By Admin 25/06/2024

AWS Machine Learning Blog: We are excited to announce the availability of the Jamba-Instruct large language model (LLM) in Amazon Bedrock. Jamba-Instruct is built by AI21 Labs, and most notably supports a 256,000-token context window, making it especially useful for processing large documents and complex Retrieval Augmented Generation (RAG) applications. What is Jamba-Instruct? Jamba-Instruct is […]

Scale and simplify ML workload monitoring on Amazon EKS with AWS Neuron Monitor container

By Admin 25/06/2024

AWS Machine Learning Blog: Amazon Web Services is excited to announce the launch of the AWS Neuron Monitor container, an innovative tool designed to enhance the monitoring capabilities of AWS Inferentia and AWS Trainium chips on Amazon Elastic Kubernetes Service (Amazon EKS). This solution simplifies the integration of advanced monitoring tools such as Prometheus and […]

Build RAG applications using Jina Embeddings v2 on Amazon SageMaker JumpStart

By Admin 06/06/2024

AWS Machine Learning Blog: Today, we are excited to announce that the Jina Embeddings v2 model, developed by Jina AI, is available for customers through Amazon SageMaker JumpStart to deploy with one click for running model inference. This state-of-the-art model supports an 8,192-token context length. You can deploy this model with SageMaker JumpStart, a […]

Falcon 2 11B is now available on Amazon SageMaker JumpStart

By Admin 31/05/2024

AWS Machine Learning Blog: Today, we are excited to announce that the first model in the next-generation Falcon 2 family, the Falcon 2 11B foundation model (FM) from Technology Innovation Institute (TII), is available through Amazon SageMaker JumpStart to deploy and run inference. Falcon 2 11B is a dense decoder model trained on a […]

AWS Inferentia and AWS Trainium deliver lowest cost to deploy Llama 3 models in Amazon SageMaker JumpStart

By Admin 03/05/2024

AWS Machine Learning Blog: Today, we’re excited to announce the availability of Meta Llama 3 inference on AWS Trainium and AWS Inferentia based instances in Amazon SageMaker JumpStart. The Meta Llama 3 models are a collection of pre-trained and fine-tuned generative text models. Amazon Elastic Compute Cloud (Amazon EC2) Trn1 and Inf2 instances, powered by […]

Databricks DBRX is now available in Amazon SageMaker JumpStart

By Admin 26/04/2024

AWS Machine Learning Blog: Today, we are excited to announce that the DBRX model, an open, general-purpose large language model (LLM) developed by Databricks, is available for customers through Amazon SageMaker JumpStart to deploy with one click for running inference. The DBRX LLM employs a fine-grained mixture-of-experts (MoE) architecture, pre-trained on 12 trillion tokens of […]

Introducing automatic training for solutions in Amazon Personalize

By Admin 20/04/2024

AWS Machine Learning Blog: Amazon Personalize is excited to announce automatic training for solutions. Solution training is fundamental to maintaining the effectiveness of a model and making sure recommendations align with users’ evolving behaviors and preferences. As data patterns and trends change over time, retraining the solution with the latest relevant data enables the model […]

Use Kubernetes Operators for new inference capabilities in Amazon SageMaker that reduce LLM deployment costs by 50% on average

By Admin 19/04/2024

AWS Machine Learning Blog: We are excited to announce a new version of the Amazon SageMaker Operators for Kubernetes using the AWS Controllers for Kubernetes (ACK). ACK is a framework for building Kubernetes custom controllers, where each controller communicates with an AWS service API. These controllers allow Kubernetes users to provision AWS resources like buckets, […]

Fine-tune Code Llama on Amazon SageMaker JumpStart

By Admin 18/03/2024

AWS Machine Learning Blog: Today, we are excited to announce the capability to fine-tune Code Llama models by Meta using Amazon SageMaker JumpStart. The Code Llama family of large language models (LLMs) is a collection of pre-trained and fine-tuned code generation models ranging in scale from 7 billion to 70 billion parameters. Fine-tuned Code Llama […]