Building an SAP-RPT-1 “Hello World” with AI Core
Using SAP-RPT-1 in the playground is a nice way to get a first impression, but you’ll soon realize there are some limitations. In this blog post, let’s explore how to use SAP-RPT-1 in a production-like environment. I’ll walk you through the steps to build a simple “Hello World” application using SAP-RPT-1 and AI Core.
Creating a Deployment for SAP-RPT-1
Throughout this blog post, I’ll assume that you have access to a BTP subaccount with instances of the AI Core service (extended plan) and the AI Launchpad service (standard plan).
First, you need to create a configuration for SAP-RPT-1 with the following parameters:
- Configuration Name: For example, sap-rpt-1-large or sap-rpt-1-small (I recommend naming the configuration so that you can easily recognize the underlying model)
- Scenario: foundation-models
- Version: 0.0.1 (default value)
- Executable: aicore-sap (meaning that this model is hosted by SAP; you can find this value in note 3437766)
- Model Name: sap-rpt-1-large or sap-rpt-1-small (again, as per note 3437766)
- Model Version: latest
In the next step, you can turn this configuration into a deployment by clicking on the “Create Deployment” button in the configuration overview. You can leave all the default values as they are and create the deployment.
After a few minutes, the deployment should reach the status “Running” and you can start using it. Note that a deployment URL has also been created, something like https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2/inference/deployments/1234567890abcdef. The most important part is the deployment ID at the end of the URL (in this example 1234567890abcdef), which you’ll need later on.
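If you would rather copy the full deployment URL and extract the ID programmatically, a minimal sketch could look like this. The helper name deployment_id_from_url is my own, and it simply assumes that the ID is the last path segment of the URL, as in the example above.

```python
from urllib.parse import urlparse


def deployment_id_from_url(deployment_url: str) -> str:
    """Return the last path segment of a deployment URL (hypothetical helper)."""
    # Drop a trailing slash, if any, so the last segment is never empty.
    path = urlparse(deployment_url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


example_url = (
    "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com"
    "/v2/inference/deployments/1234567890abcdef"
)
print(deployment_id_from_url(example_url))  # 1234567890abcdef
```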
Handling AI Core Authentication
At the time of writing, the SAP Cloud SDK for Python does not yet support the RPT-1 models. Following the documentation, we will therefore use the requests library to call the deployment endpoint directly.
Before we can do that, however, we need to authenticate, as also described in the documentation. What may look intimidating at first is actually quite straightforward. We only need a few parameters, which we’ll keep out of the code by storing them in a .env file. You can collect this information from the service key of your AI Core instance.
AICORE_API_URL="https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2"
AICORE_AUTH_URL="https://a1b2c3d4e4f61234.authentication.eu10.hana.ondemand.com" # called "url" in the service key
AICORE_CLIENT_ID="sb-95d9f03d" # shortened, yours will be longer
AICORE_CLIENT_SECRET="c51ad55c" # shortened, yours will be longer
AICORE_RESOURCE_GROUP="default" # your actual resource group
Additionally, store the RPT deployment ID(s) you collected earlier:
RPT1L_DEPLOYMENT_ID="1234567890abcdef" # your actual deployment ID for RPT-1 large
RPT1S_DEPLOYMENT_ID="abcdef1234567890" # your actual deployment ID for RPT-1 small
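A missing or misspelled entry in the .env file otherwise only shows up later as a KeyError or a failed request. As a small optional sketch, the settings can be checked up front. The helper name missing_settings and the list of required names are my own choices based on the variables above; the resource group is left out because the code further down falls back to "default" for it.

```python
import os

# Settings this walkthrough relies on (names as used in the .env file above).
REQUIRED_SETTINGS = [
    "AICORE_API_URL",
    "AICORE_AUTH_URL",
    "AICORE_CLIENT_ID",
    "AICORE_CLIENT_SECRET",
    "RPT1L_DEPLOYMENT_ID",
]


def missing_settings(env=None):
    """Return the names of required settings that are absent or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]
```

After load_dotenv() has run, calling missing_settings() and raising an error if the returned list is not empty gives a clearer message than a failing request.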
Once the .env file is ready, we can use the following code to retrieve an access token from the authentication endpoint:
#!pip install python-dotenv # install if not already installed
import os
import requests
from dotenv import load_dotenv
# Load .env file
load_dotenv()
AICORE_AUTH_URL = os.environ["AICORE_AUTH_URL"]
AICORE_CLIENT_ID = os.environ["AICORE_CLIENT_ID"]
AICORE_CLIENT_SECRET = os.environ["AICORE_CLIENT_SECRET"]
token_url = f"{AICORE_AUTH_URL}/oauth/token"
response = requests.post(
    token_url,
    auth=(AICORE_CLIENT_ID, AICORE_CLIENT_SECRET),
    data={"grant_type": "client_credentials"},
)
response.raise_for_status()
token_data = response.json()
access_token = token_data["access_token"]
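Access tokens expire, so a longer-running application should not fetch one per request, nor keep one forever. The following is a minimal caching sketch, not part of the official documentation. The class TokenCache and its parameters are hypothetical, and it assumes that the token response carries the standard OAuth 2.0 expires_in field (lifetime in seconds). If that field is missing, the cache simply fetches a fresh token on every call.

```python
import time


class TokenCache:
    """Cache an access token and refresh it shortly before it expires."""

    def __init__(self, fetch_token, safety_margin=60, clock=time.monotonic):
        # fetch_token: callable returning the parsed token response (a dict).
        self._fetch_token = fetch_token
        self._safety_margin = safety_margin  # refresh this many seconds early
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            token_data = self._fetch_token()
            self._token = token_data["access_token"]
            # Assumption: "expires_in" is present; default to 0 otherwise.
            lifetime = float(token_data.get("expires_in", 0))
            self._expires_at = now + max(lifetime - self._safety_margin, 0.0)
        return self._token
```

Here, fetch_token could be a small function that wraps the requests.post call from above and returns response.json().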
Preparing Test Data
For testing the connection to the RPT-1 deployment, we’ll use the same test data we used when trying out RPT-1 in the playground.
Note: For the next steps, I also created a Jupyter notebook, which you can download to interactively do all the steps below yourself.
Here’s the test data in the format RPT-1 expects:
payload = {
    "prediction_config": {
        "target_columns": [
            {
                "name": "SALESGROUP",
                "prediction_placeholder": "[PREDICT]"
                # "task_type": "classification" or "regression" can be specified here if needed
            }
        ]
    },
    "index_column": "ID",
    "rows": [
        {
            "ID": "1001",
            "PRODUCT": "Tablet",
            "PRICE": 599.00,
            "CUSTOMER": "TechStart Inc",
            "COUNTRY": "USA",
            "SALESGROUP": "[PREDICT]"
        },
        {
            "ID": "1002",
            "PRODUCT": "Standing Desk",
            "PRICE": 325.50,
            "CUSTOMER": "Workspace Solutions",
            "COUNTRY": "Germany",
            "SALESGROUP": "[PREDICT]"
        },
        {
            "ID": "1003",
            "PRODUCT": "Workstation",
            "PRICE": 1450.00,
            "CUSTOMER": "Enterprise Systems Ltd",
            "COUNTRY": "Canada",
            "SALESGROUP": "Enterprise Solutions"
        },
        {
            "ID": "1004",
            "PRODUCT": "Laptop Pro",
            "PRICE": 1899.99,
            "CUSTOMER": "Business Corp",
            "COUNTRY": "UK",
            "SALESGROUP": "Enterprise Solutions"
        },
        {
            "ID": "1005",
            "PRODUCT": "Gaming Laptop",
            "PRICE": 1250.00,
            "CUSTOMER": "Digital Ventures",
            "COUNTRY": "USA",
            "SALESGROUP": "Enterprise Solutions"
        },
        {
            "ID": "1006",
            "PRODUCT": "Smart Watch",
            "PRICE": 299.99,
            "CUSTOMER": "Gadget Store",
            "COUNTRY": "Australia",
            "SALESGROUP": "Consumer Electronics"
        },
        {
            "ID": "1007",
            "PRODUCT": "Ergonomic Chair",
            "PRICE": 445.00,
            "CUSTOMER": "Office Outfitters",
            "COUNTRY": "France",
            "SALESGROUP": "Office Furniture"
        },
        {
            "ID": "1008",
            "PRODUCT": "Storage Array",
            "PRICE": 3500.00,
            "CUSTOMER": "CloudTech Systems",
            "COUNTRY": "Singapore",
            "SALESGROUP": "Data Infrastructure"
        },
        {
            "ID": "1009",
            "PRODUCT": "Network Switch",
            "PRICE": 175.50,
            "CUSTOMER": "ConnectIT",
            "COUNTRY": "Japan",
            "SALESGROUP": "Networking Devices"
        }
    ]
}
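In practice, the rows usually come from a file or a database rather than a hand-written literal. As a rough sketch, a small helper can assemble the same structure as the payload above from a list of dictionaries. The function build_payload is hypothetical, and treating a target value of None as "to be predicted" is my own convention for this example, not something the RPT-1 API prescribes.

```python
def build_payload(rows, target_column, index_column="ID", placeholder="[PREDICT]"):
    """Build a request body in the shape shown above (hypothetical helper)."""
    prepared_rows = []
    for row in rows:
        prepared = dict(row)  # copy, so the caller's data is not modified
        # Convention of this sketch: None in the target column means "predict".
        if prepared.get(target_column) is None:
            prepared[target_column] = placeholder
        prepared_rows.append(prepared)
    return {
        "prediction_config": {
            "target_columns": [
                {"name": target_column, "prediction_placeholder": placeholder}
            ]
        },
        "index_column": index_column,
        "rows": prepared_rows,
    }


example_rows = [
    {"ID": "1001", "PRODUCT": "Tablet", "SALESGROUP": None},
    {"ID": "1007", "PRODUCT": "Ergonomic Chair", "SALESGROUP": "Office Furniture"},
]
example_payload = build_payload(example_rows, target_column="SALESGROUP")
```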
For better readability, here is the data in table format:
import pandas as pd
sample_data = pd.DataFrame(payload["rows"])
sample_data
| ID | PRODUCT | PRICE | CUSTOMER | COUNTRY | SALESGROUP |
|---|---|---|---|---|---|
| 1001 | Tablet | 599.00 | TechStart Inc | USA | [PREDICT] |
| 1002 | Standing Desk | 325.50 | Workspace Solutions | Germany | [PREDICT] |
| 1003 | Workstation | 1450.00 | Enterprise Systems Ltd | Canada | Enterprise Solutions |
| 1004 | Laptop Pro | 1899.99 | Business Corp | UK | Enterprise Solutions |
| 1005 | Gaming Laptop | 1250.00 | Digital Ventures | USA | Enterprise Solutions |
| 1006 | Smart Watch | 299.99 | Gadget Store | Australia | Consumer Electronics |
| 1007 | Ergonomic Chair | 445.00 | Office Outfitters | France | Office Furniture |
| 1008 | Storage Array | 3500.00 | CloudTech Systems | Singapore | Data Infrastructure |
| 1009 | Network Switch | 175.50 | ConnectIT | Japan | Networking Devices |
Running the Predictions
To run the predictions, we will use the requests library to send an HTTP POST request to the RPT-1 deployment endpoint. The request will include our test data and the authorization token.
AICORE_API_URL = os.environ["AICORE_API_URL"].rstrip("/")
AICORE_RESOURCE_GROUP = os.environ.get("AICORE_RESOURCE_GROUP", "default")
RPT1_DEPLOYMENT_ID = os.environ.get("RPT1L_DEPLOYMENT_ID")
if not RPT1_DEPLOYMENT_ID:
    raise ValueError("Missing RPT1L_DEPLOYMENT_ID in .env (deployment id from AI Launchpad).")
url = f"{AICORE_API_URL}/inference/deployments/{RPT1_DEPLOYMENT_ID}/predict"
headers = {
    "Authorization": f"Bearer {access_token}",
    "AI-Resource-Group": AICORE_RESOURCE_GROUP,
    "Content-Type": "application/json",
    "Accept": "application/json",
}
response = requests.post(url, headers=headers, json=payload, timeout=120)
response.raise_for_status()
Analyzing the Results
Let’s take a look at the results returned by RPT-1. Here are the predictions in the raw JSON format:
import json
data = response.json()
preds = data["predictions"]
print(json.dumps(preds, indent=2, ensure_ascii=False))
[
  {
    "ID": 1001,
    "SALESGROUP": [
      {
        "confidence": 0.93,
        "prediction": "Enterprise Solutions"
      }
    ]
  },
  {
    "ID": 1002,
    "SALESGROUP": [
      {
        "confidence": 0.78,
        "prediction": "Office Furniture"
      }
    ]
  }
]
Note that the result looks different from what we saw in the playground, which hosts the RPT-1-OSS model. RPT-1-Large also returns a confidence score for each prediction.
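The confidence score can be used to flag predictions that deserve a manual review. Here is a minimal sketch based on the response shape shown above. The function names and the threshold of 0.8 are arbitrary choices for illustration, and the code assumes that the first entry of each list is the prediction to use.

```python
def flatten_predictions(preds, target_column, index_column="ID"):
    """Turn the nested response into one flat record per predicted row."""
    flat = []
    for pred in preds:
        top = pred[target_column][0]  # assumption: first entry is the one to use
        flat.append(
            {
                index_column: pred[index_column],
                "prediction": top["prediction"],
                "confidence": top["confidence"],
            }
        )
    return flat


def low_confidence(flat, threshold):
    """Return the records whose confidence is below the threshold."""
    return [record for record in flat if record["confidence"] < threshold]


# Response excerpt copied from the output above.
example_preds = [
    {"ID": 1001, "SALESGROUP": [{"confidence": 0.93, "prediction": "Enterprise Solutions"}]},
    {"ID": 1002, "SALESGROUP": [{"confidence": 0.78, "prediction": "Office Furniture"}]},
]
flat = flatten_predictions(example_preds, target_column="SALESGROUP")
print(low_confidence(flat, threshold=0.8))
# [{'ID': 1002, 'prediction': 'Office Furniture', 'confidence': 0.78}]
```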
Let’s merge these predictions back into the tabular format to see the results more clearly:
preds_df = sample_data.copy(deep=True)
preds_df["ID"] = preds_df["ID"].astype(int)
for pred in preds:
    # Match each prediction to its row via the ID and overwrite the placeholder
    row_idx = int(pred["ID"])
    predicted_value = pred["SALESGROUP"][0]["prediction"]
    preds_df.loc[preds_df["ID"] == row_idx, "SALESGROUP"] = predicted_value
preds_df
| ID | PRODUCT | PRICE | CUSTOMER | COUNTRY | SALESGROUP |
|---|---|---|---|---|---|
| 1001 | Tablet | 599.00 | TechStart Inc | USA | Enterprise Solutions |
| 1002 | Standing Desk | 325.50 | Workspace Solutions | Germany | Office Furniture |
| 1003 | Workstation | 1450.00 | Enterprise Systems Ltd | Canada | Enterprise Solutions |
| 1004 | Laptop Pro | 1899.99 | Business Corp | UK | Enterprise Solutions |
| 1005 | Gaming Laptop | 1250.00 | Digital Ventures | USA | Enterprise Solutions |
| 1006 | Smart Watch | 299.99 | Gadget Store | Australia | Consumer Electronics |
| 1007 | Ergonomic Chair | 445.00 | Office Outfitters | France | Office Furniture |
| 1008 | Storage Array | 3500.00 | CloudTech Systems | Singapore | Data Infrastructure |
| 1009 | Network Switch | 175.50 | ConnectIT | Japan | Networking Devices |
For comparison, here is the original sample data before predictions:
sample_data
| ID | PRODUCT | PRICE | CUSTOMER | COUNTRY | SALESGROUP |
|---|---|---|---|---|---|
| 1001 | Tablet | 599.00 | TechStart Inc | USA | [PREDICT] |
| 1002 | Standing Desk | 325.50 | Workspace Solutions | Germany | [PREDICT] |
| 1003 | Workstation | 1450.00 | Enterprise Systems Ltd | Canada | Enterprise Solutions |
| 1004 | Laptop Pro | 1899.99 | Business Corp | UK | Enterprise Solutions |
| 1005 | Gaming Laptop | 1250.00 | Digital Ventures | USA | Enterprise Solutions |
| 1006 | Smart Watch | 299.99 | Gadget Store | Australia | Consumer Electronics |
| 1007 | Ergonomic Chair | 445.00 | Office Outfitters | France | Office Furniture |
| 1008 | Storage Array | 3500.00 | CloudTech Systems | Singapore | Data Infrastructure |
| 1009 | Network Switch | 175.50 | ConnectIT | Japan | Networking Devices |
Conclusion
We’ve successfully transitioned from using SAP-RPT-1 in the playground to deploying and using it via AI Core. This setup allows you to use RPT-1 in a scalable, production-like environment. On the playground, for example, you quickly hit limits if you try more complex scenarios or larger datasets. Using a deployment via AI Core overcomes these limitations.