Inspired by andreas.forster’s excellent blog SAP Data Intelligence: Create your first ML Scenario and encouraged by karim.mohraz’s incredibly helpful blog Train and Deploy a Tensorflow Pipeline in SAP Data Intelligence, in this blog I combine both their approaches to demonstrate how to create an SAP Data Intelligence machine learning scenario with TensorFlow that is as plain vanilla as possible.

To start with, I create a Data Workspace and respective Data Collection in the SAP Data Intelligence ML Data Manager and upload Andreas’s training data there:


From my Jupyter Lab Data Manager, I get to my Data Collection and copy the code to load my training data:
import pandas as pd
import sapdi
ws = sapdi.get_workspace(name='architectSAP')
dc = ws.get_datacollection(name='architectSAP')
with dc.open('RunningTimes.csv').get_reader() as reader:
    df = pd.read_csv(reader, sep=';')
df.head()

   ID  HALFMARATHON_MINUTES  MARATHON_MINUTES
0   1                    73               149
1   2                    74               154
2   3                    78               158
3   4                    73               165
4   5                    74               172

On that basis, I build my data set:
x = df[['HALFMARATHON_MINUTES']]
y_true = df[['MARATHON_MINUTES']]
import tensorflow as tf
dataset = tf.data.Dataset.from_tensor_slices((x.values, y_true.values))
dataset = dataset.batch(1)
for feat, targ in dataset.take(5):
    print('Features: {}, Target: {}'.format(feat, targ))
print(x.shape)
print(y_true.shape)

Features: [[73]], Target: [[149]]
Features: [[74]], Target: [[154]]
Features: [[78]], Target: [[158]]
Features: [[73]], Target: [[165]]
Features: [[74]], Target: [[172]]
(117, 1)
(117, 1)

To create, compile and train my model:
model = tf.keras.Sequential([tf.keras.layers.Dense(4, name='hidden', batch_size=1, input_shape=(1,)), tf.keras.layers.Dense(1, name='output')])
model.compile(tf.keras.optimizers.Adam(), tf.keras.losses.MeanSquaredError())
model.summary()
history = model.fit(dataset, epochs=8)

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
hidden (Dense) (1, 4) 8
_________________________________________________________________
output (Dense) (1, 1) 5
=================================================================
Total params: 13
Trainable params: 13
Non-trainable params: 0
_________________________________________________________________
Epoch 1/8
117/117 [==============================] - 0s 2ms/step - loss: 80740.5312
Epoch 2/8
117/117 [==============================] - 0s 3ms/step - loss: 57319.2930
Epoch 3/8
117/117 [==============================] - 0s 3ms/step - loss: 36689.4297
Epoch 4/8
117/117 [==============================] - 0s 3ms/step - loss: 18188.9629
Epoch 5/8
117/117 [==============================] - 0s 3ms/step - loss: 6463.0679
Epoch 6/8
117/117 [==============================] - 0s 3ms/step - loss: 1668.1143
Epoch 7/8
117/117 [==============================] - 0s 3ms/step - loss: 459.1681
Epoch 8/8
117/117 [==============================] - 0s 4ms/step - loss: 292.9158
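
As a quick sanity check that is not part of the original blogs, the trained model can already be asked for a prediction, for example for a hypothetical 90 minute half marathon; the exact output depends on the random weight initialization:
# Illustrative only: predict the marathon time for a hypothetical 90 minute half marathon
print(model.predict([[90]]))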


The results are of course very similar to Andreas’s, with my model’s prediction (red) plotted against the least squares optimum (green), but this time leveraging TensorFlow Keras:
import matplotlib.pyplot as plot
import numpy as np
m, b = np.polyfit(np.squeeze(x), y_true, 1)
plot.scatter(x, y_true);
plot.plot(x, model.predict(x), color = 'red');
plot.plot(x, m*x + b, color = 'green');
plot.xlabel("Actual Minutes Half-Marathon");
plot.ylabel("Actual Minutes Marathon");


And I check that there is no correlation left between the feature and the residuals, by scatter plotting the residuals to verify their randomness:
plot.scatter(x, y_true - model.predict(x), color="orange");
plot.xlabel("Actual Minutes Half-Marathon");
plot.ylabel("Residuals");


To leverage these results, I add a Python Producer in the SAP Data Intelligence ML Scenario Manager to create, compile, train and store my model. Since I want to stay as plain vanilla as possible, I only add a few lines to the template and stick with its naming conventions:
import pandas as pd
import tensorflow as tf
import io
import json
import h5py

# Example Python script to perform training on input data & generate Metrics & Model Blob
def on_input(data):
    # to send metrics to the Submit Metrics operator, create a Python dictionary of key-value pairs
    df = pd.read_csv(io.StringIO(data), sep=';')
    x = df[['HALFMARATHON_MINUTES']]
    y_true = df[['MARATHON_MINUTES']]
    dataset = tf.data.Dataset.from_tensor_slices((x.values, y_true.values))
    dataset = dataset.batch(1)
    model = tf.keras.Sequential([tf.keras.layers.Dense(4, batch_size=1, input_shape=(1,)), tf.keras.layers.Dense(1)])
    model.compile(tf.keras.optimizers.Adam(), tf.keras.losses.MeanSquaredError())
    history = model.fit(dataset, epochs=8)
    # metrics_dict = {"kpi1": "1"}
    metrics_dict = json.dumps({'loss': str(history.history['loss'][len(history.history['loss']) - 1])})

    # send the metrics to the output port - Submit Metrics operator will use this to persist the metrics
    api.send("metrics", api.Message(metrics_dict))

    # create & send the model blob to the output port - Artifact Producer operator will use this to persist the model and create an artifact ID
    f = h5py.File('blob', driver='core', backing_store=False)
    model.save(f)
    f.flush()
    # model_blob = bytes("example", 'utf-8')
    model_blob = f.id.get_file_image()
    api.send("modelBlob", model_blob)

api.set_port_callback("input", on_input)

Since I use the TensorFlow Python libraries, I need to add a Group with a tag to identify my Docker image:
FROM $com.sap.sles.ml.python
RUN python3.6 -m pip --no-cache-dir install --user --upgrade pip
RUN python3.6 -m pip --no-cache-dir install --user tensorflow


I then execute my Python Producer with these parameters after changing the Connection in the Read File Operator:



To obtain my Metrics, Models and Datasets:


To consume these, in the SAP Data Intelligence ML Scenario Manager, I add a Python Consumer. Since I still want to stay as plain vanilla as possible, I only add a few lines to the template to apply my model and obtain my results:
# apply your model
blob = io.BytesIO(model)
f = h5py.File(blob, 'r')
architectSAP = tf.keras.models.load_model(f)
blob.close()
# obtain your results
prediction = architectSAP.predict([[json.loads(user_data)['half_marathon_minutes']]])
success = True

As well as pass the successful response to the user:
# apply carried out successfully, send a response to the user
# msg.body = json.dumps({'Results': 'Model applied to input data successfully.'})
msg.body = json.dumps({'marathon_minutes_prediction': str(prediction[0])})

I then deploy my Python Consumer after adding a Group for my Docker image again and passing in my model:



Once my Deployment is successful:


I retrieve my prediction e.g. with Postman:


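If you prefer to script this call instead of using Postman, a minimal sketch with the Python requests library could look as follows; the deployment URL, tenant, user and password are placeholders that you have to replace with the values of your own Deployment, and I assume basic authentication here:
import json
import requests

# Placeholders: take the URL from your Deployment in the ML Scenario Manager and use your own tenant credentials
url = '<your deployment URL>'
payload = {'half_marathon_minutes': 90}  # key matches json.loads(user_data)['half_marathon_minutes'] in the Python Consumer
response = requests.post(url, auth=('<tenant>\\<user>', '<password>'), headers={'Content-Type': 'application/json'}, data=json.dumps(payload))
print(response.status_code, response.text)  # e.g. {"marathon_minutes_prediction": "[...]"}
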
I tried to keep this blog as plain vanilla as possible, to help you understand the underlying basic concepts. However, there is of course nothing wrong with a bit more sophistication, e.g. building a custom TensorFlow Operator to make your Graphs more efficient and easier to read:

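To give you an idea what such a custom operator might contain, here is a rough sketch that only encapsulates the TensorFlow part of the Python Producer above; the port names "input" and "loss" are illustrative and have to match the ports you configure in the operator editor, and the operator still needs the TensorFlow Docker tag:
import io
import pandas as pd
import tensorflow as tf

def on_input(data):
    # Train the same Keras model as in the Python Producer and emit the final loss
    df = pd.read_csv(io.StringIO(data), sep=';')
    dataset = tf.data.Dataset.from_tensor_slices((df[['HALFMARATHON_MINUTES']].values, df[['MARATHON_MINUTES']].values)).batch(1)
    model = tf.keras.Sequential([tf.keras.layers.Dense(4, batch_size=1, input_shape=(1,)), tf.keras.layers.Dense(1)])
    model.compile(tf.keras.optimizers.Adam(), tf.keras.losses.MeanSquaredError())
    history = model.fit(dataset, epochs=8)
    api.send("loss", str(history.history['loss'][-1]))

api.set_port_callback("input", on_input)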