Deploying NudgeBee AI on AWS SageMaker
Overview
This guide walks you through deploying the NudgeBee AI model on AWS SageMaker.
Prerequisites
- AWS account with access to SageMaker, S3, IAM, and ECR
- Trained NudgeBee model (.tar.gz)
- Model uploaded to an S3 bucket
- IAM role with required permissions
Step-by-Step Guide
Step 1: Upload Model to S3
- Log in to AWS Console
- Go to Amazon S3
- Create or open a bucket
- Upload nudgebee_model.tar.gz to the bucket
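The same upload can be scripted instead of done in the console. The following is a minimal sketch using boto3, assuming AWS credentials are already configured locally; the bucket name in the usage comment is a placeholder, and the helper names are illustrative rather than part of NudgeBee.

```python
def model_s3_uri(bucket: str, key: str = "nudgebee_model.tar.gz") -> str:
    """Build the S3 URI that Step 2 expects as the model artifact path."""
    return f"s3://{bucket}/{key}"


def upload_model(local_path: str, bucket: str, key: str = "nudgebee_model.tar.gz") -> str:
    """Upload the model archive to S3 and return its S3 URI."""
    import boto3  # imported lazily so model_s3_uri works without boto3 installed

    s3 = boto3.client("s3")
    s3.upload_file(local_path, bucket, key)  # arguments: Filename, Bucket, Key
    return model_s3_uri(bucket, key)


# Example call (placeholder bucket name, requires AWS credentials):
# upload_model("nudgebee_model.tar.gz", "my-nudgebee-bucket")
```

The returned URI is the value to supply as the model artifact path when creating the SageMaker model.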
Step 2: Create SageMaker Model
- Open Amazon SageMaker
- Go to Models > Create model
- Name the model (e.g., nudgebee-ai-model)
- Provide ECR container image URL
- Add S3 path to the model artifact
- Choose IAM role with permissions
- Click Create model
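The console steps above map onto the SageMaker CreateModel API. A rough boto3 equivalent is sketched below; the helper names are illustrative, and the image URI, S3 path, role ARN, and region are values you must supply from your own account.

```python
def build_create_model_request(model_name: str, image_uri: str,
                               model_data_url: str, role_arn: str) -> dict:
    """Assemble the keyword arguments for SageMaker's create_model call."""
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,              # ECR container image URL
            "ModelDataUrl": model_data_url,  # S3 path to the .tar.gz artifact
        },
        "ExecutionRoleArn": role_arn,        # IAM role SageMaker assumes
    }


def create_model(request: dict, region: str) -> str:
    """Create the model and return its ARN (requires AWS credentials)."""
    import boto3  # imported lazily so the request builder has no dependencies

    sagemaker = boto3.client("sagemaker", region_name=region)
    return sagemaker.create_model(**request)["ModelArn"]
```

Keeping the request builder separate from the API call makes it easy to review or log the exact parameters before anything is created.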
Step 3: Deploy Endpoint
- Go to Inference > Endpoint Configurations
- Create new configuration and add the model
- Choose instance type (e.g., ml.m5.large)
- Create endpoint and wait for it to become InService
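For a scripted deployment, the endpoint configuration and endpoint can be created with boto3 as sketched below. The configuration and endpoint names passed in are your own choice, the variant name "AllTraffic" and the single-instance count are assumptions, and the waiter simply blocks until the endpoint reports InService.

```python
def build_endpoint_config_request(config_name: str, model_name: str,
                                  instance_type: str = "ml.m5.large",
                                  instance_count: int = 1) -> dict:
    """Assemble the keyword arguments for create_endpoint_config."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",  # assumed name for the single variant
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    }


def deploy_endpoint(endpoint_name: str, config_request: dict, region: str) -> None:
    """Create the config and endpoint, then wait for InService."""
    import boto3  # imported lazily; requires AWS credentials at call time

    sagemaker = boto3.client("sagemaker", region_name=region)
    sagemaker.create_endpoint_config(**config_request)
    sagemaker.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_request["EndpointConfigName"],
    )
    # Polls DescribeEndpoint until the status is InService or the waiter fails.
    sagemaker.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
```

Endpoint creation typically takes several minutes, so the waiter call is the scripted counterpart of watching the status column in the console.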
RAG and LLM Server Configuration
RAG Server (SageMaker)
EMBEDDINGS_PROVIDER=sagemaker
EMBEDDINGS_PROVIDER_REGION=<AWS_SageMaker_Region>
EMBEDDINGS_PROVIDER_API_ENDPOINT=<SageMaker_Endpoint_URL>
EMBEDDINGS_MODEL_NAME=<Model_Name>
LLM Server (SageMaker)
LLM_PROVIDER=sagemaker
LLM_PROVIDER_API_ENDPOINT=<SageMaker_Endpoint_URL>
LLM_PROVIDER_REGION=<AWS_SageMaker_Region>
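Before starting the servers, it can help to confirm that none of the settings above are missing or still contain a <placeholder> value. The check below is only an illustrative pre-flight sketch; it does not reflect how NudgeBee itself reads its configuration, and the function name is hypothetical.

```python
import os

RAG_KEYS = (
    "EMBEDDINGS_PROVIDER",
    "EMBEDDINGS_PROVIDER_REGION",
    "EMBEDDINGS_PROVIDER_API_ENDPOINT",
    "EMBEDDINGS_MODEL_NAME",
)
LLM_KEYS = (
    "LLM_PROVIDER",
    "LLM_PROVIDER_API_ENDPOINT",
    "LLM_PROVIDER_REGION",
)


def missing_settings(env: dict, keys: tuple) -> list:
    """Return the keys that are unset, empty, or still an angle-bracket placeholder."""
    missing = []
    for key in keys:
        value = env.get(key, "").strip()
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(key)
    return missing


# Example: check the current process environment for both servers.
# problems = missing_settings(dict(os.environ), RAG_KEYS + LLM_KEYS)
```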
Testing
- Use SageMaker Console > Endpoints > Test
- Send a JSON request and validate the response
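The endpoint can also be tested from code through the SageMaker runtime API. The sketch below assumes the container accepts and returns JSON; the payload shape in the usage comment is a placeholder, since the request schema depends on the deployed model.

```python
import json


def build_invoke_request(endpoint_name: str, payload: dict) -> dict:
    """Assemble the keyword arguments for sagemaker-runtime invoke_endpoint."""
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Accept": "application/json",
        "Body": json.dumps(payload).encode("utf-8"),
    }


def invoke(request: dict, region: str) -> dict:
    """Send the request and parse the JSON response (requires AWS credentials)."""
    import boto3  # imported lazily so the request builder has no dependencies

    runtime = boto3.client("sagemaker-runtime", region_name=region)
    response = runtime.invoke_endpoint(**request)
    return json.loads(response["Body"].read())  # Body is a streaming object


# Example call (placeholder endpoint name, region, and payload):
# invoke(build_invoke_request("nudgebee-ai-endpoint", {"inputs": "hello"}), "us-east-1")
```

A successful call returns the parsed JSON body; an endpoint that is not yet InService or a malformed payload raises a botocore ClientError instead.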