It can be challenging to take a model that a technical team has created and deploy it so that other users can quickly and easily draw insights from its results. Dataiku DSS webapps can reduce this effort. In this two-part blog, we will break down how to create and publish a webapp that predicts a value by passing user inputs to a trained machine learning model.

 

PART 1 – Setting up a Webapp With a Basic UI 

What are Webapps in Dataiku? 

A web app generally refers to an application that runs in a web browser and that end users interact with, such as an online shopping cart or a healthcare portal.

In Dataiku, a webapp is an application created and hosted within DSS. It can serve interactive visualizations or act as a front-end application that lets users interact with underlying data, models, and other processes. Dataiku supports four types of code webapps: “Standard”, R Shiny, Bokeh, and Dash. Our focus will be on Dash, which is a Python framework. Dataiku also offers Visual Webapps, which package code webapps as plugins so that users who are less comfortable with programming languages can reuse them.

Problem Background  

We will build a public webapp in DSS using Dash that will allow a user to generate predictions of alcohol by volume (%) for a wine based on several inputs. We will not go through the details of the model creation, but here are some important notes:  

  • We used a publicly available wine quality dataset to train the models  
  • We created two linear regression models (one for red wines and one for white) to predict the alcohol by volume (%)  
  • We dumped the trained models into pickle (.pickle) files to use in our webapp 

How to Build a Webapp in Dataiku DSS 

1. Upload Model Files to Your DSS Project.  

We created a blank project on our DSS instance and now need to upload the .pickle files containing our trained models. DSS does not natively support this file type, so the files must be uploaded to a managed folder.

*Note: If the models are modified (for example, new features are added or the models are retrained on new data), these files will need to be re-uploaded to the folder.

To create a managed folder, navigate to the flow view, select “+ DATASET,” and then “Folder” from the dropdown.  

Drag and drop your .pickle files to add them to the folder. 

2. Create a Webapp

To access webapps within the project, hover over the code symbol (</>) in the top bar and select “Webapps” from the dropdown.

Click the large orange button to create your first webapp and select “Code Webapp” from the menu.

DSS provides support for four types of webapps, and it will display all of them for you to choose from. For creating the UI and backend of our webapp, we will use Dash. 

Choose “Dash” and select “An empty Dash app” to create your webapp. 

After hitting create, you will be automatically moved to the “Settings” tab and shown a Python script along with the “Preview” page.

3. How to Set up the Backend of a Dataiku Webapp

We need to start the backend to run our webapp and take advantage of the interactive preview while developing. To do so, we must use or set up a code env with the dash package and any required packages to run the models we uploaded. You can find a complete list of packages for running the example project in the project Wiki.  

 Once you have created or identified the appropriate code env, navigate to “Settings” and select that environment. You will then need to hit “SAVE” to start the backend. 

After the backend starts, the system will automatically return you to the “Preview” mode. 

4. Add Our Custom Functionality.

 Dataiku provides some starter code, but we will remove this and build our own script. As you add script components, remember to save your work continually. The continuous saves will prevent you from losing progress and populate the interactive preview as you add components to the script.   

a. Import necessary packages 

First, we need to import all the necessary packages for our project. The first three packages are essential for accessing and running our models, while the remaining packages are used for creating our UI. 

b. UI Layout

The next step is to set up the style of our UI for reuse throughout our script. We can define components such as font colors, background colors, and font styles.

Once we have the styles defined, we can use them in our layout. We start by defining the layout’s default style and adding components such as a title and description. The available components, along with guidance on defining and using them, are covered in the Dash HTML Components documentation.

c. Adding Inputs 

We need to provide input fields for users to interact with our model. Our webapp will utilize three different kinds of inputs, all of which are Dash Core Components. 

i. Dropdown 

We will create a dropdown to select the type of wine and, consequently, the appropriate model to run. Our dropdown will have two options, red or white wine. There are three properties we are going to define for our dropdown:  

          • Id: unique identifier for the component  
          • Options: a list of dictionaries containing the label and value for each item in the dropdown  
          • Placeholder: text that will appear in the dropdown box before the user selects

 ii. Input

Two fields will take a number via an input component. The Dash input field also supports eight other data types besides numbers. The following properties will define each input:

          • Id: unique identifier for the component  
          • Type: the kind of value to accept  
          • Value: the default value of the input 

iii. Slider

Since the pH value exists within a limited range, we will use a slider component. The slider shows the user a fixed range of values and moves in fixed steps. The following properties will define the slider: 

          • Id: unique identifier for the component  
          • Value: the default value of the slider  
          • Min: minimum value of the slider  
          • Max: maximum value of the slider  
          • Step: the interval between values  
          • Marks: points displayed on the slider; allows for customization of labels and style 

We are also going to add an HTML division after the slider to show the user exactly which value they have selected since we do not have a mark for every value.  This functionality will be set up using callbacks later. 

At this point, you should be able to see the UI we have created. In the next post, we will continue developing our webapp by adding the ability to call the model and the steps for deploying the final webapp publicly. 

PART 2 – Adding Interactive Components and Deploying the Webapp 

d. Calculate button

The final UI component to set up is a button for the user to click to submit their inputs and receive the prediction from the model. The button is a Dash HTML component and has the following properties: 

      • Id: unique identifier for the component  
      • N_clicks: used to track the number of times the button has been clicked  
      • Children: the label to appear on the button 

While this will show the button on the UI, the functionality will be defined in a callback that runs the model. We will also add an HTML division under the button to show the result of the model. 

e. Callback Functions

We use callback functions to take user inputs, process them, and pass the result to another component as output. We will use callback functions in our webapp to show users what value they have selected on the pH slider and to run the model when the button is clicked. 

i. Show Values

We will use a callback function to show the user the explicit value they have selected on the slider. The first step is to define the Outputs and Inputs of the callback function using the component ids we assigned earlier. The Output will be the HTML division we created, and the Input will be the value of the Slider component. 

Next, we need to set up a function to return the value. This is done using a simple Python function that returns the value as text for the division to display. 

ii. Running the model

We must carry out five steps to run the model. First, we need to define our Outputs, Inputs, and States. The Output will be the HTML division created to show the result, and this time our input is the number of times the button has been clicked. We are also including a State for each additional input. By assigning these as States rather than Inputs, we ensure that the model will only be called when the button is clicked rather than each time a value is changed. 

Next, we need to set up a function to carry out the rest of the process. This function will be more complex than the previous one since we must utilize multiple inputs and access our model.  

Within the function, we first need to check that the button we created has been pushed, which we can do by checking the number of times it has been clicked and using that value in an if statement. 

Once the button is clicked, we will check that all the inputs are valid, using a series of if statements, before passing them to a model. The wine type has two defined options plus “None,” which indicates that the user has not selected a value. For “None,” we will return an error message to alert the user that they need to choose a type. We also want to make sure that the user has entered values within the expected ranges to prevent nonsensical answers from being returned. 
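The click check and the validation could be sketched as a small helper like the one below. The function name, the feature names, and the numeric ranges are placeholders, so substitute the ranges that make sense for your training data.

```python
# Accepted range per input; these bounds are placeholders, not project values.
VALID_RANGES = {
    "residual sugar": (0.0, 70.0),
    "density": (0.98, 1.04),
    "pH": (2.5, 4.5),
}


def check_inputs(n_clicks, wine_type, residual_sugar, density, ph):
    """Return a message to display, or None when the inputs can be used."""
    # Nothing to show until the button has been clicked at least once.
    if not n_clicks:
        return ""
    # None means the user has not picked an option in the dropdown.
    if wine_type is None:
        return "Please select a wine type."
    if wine_type not in ("red", "white"):
        return f"Unknown wine type: {wine_type}"
    values = {"residual sugar": residual_sugar, "density": density, "pH": ph}
    for name, value in values.items():
        low, high = VALID_RANGES[name]
        # A number input reports None when it is empty or not a valid number.
        if value is None or not low <= value <= high:
            return f"Please enter a {name} value between {low} and {high}."
    return None
```

The callback can return any non-None result from this helper directly to the result division and only call the model when it gets None back.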

If the user has selected either “Red wine” or “White wine,” we need a few lines of code to open the corresponding .pickle file, run the model to generate our prediction, and return the predicted value. To do this, we will use the DSS Python API to access the appropriate file in the Managed Folder we created. Once the file is open, we can pass in our values and return the prediction to the UI. 
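A minimal sketch of the loading and prediction step is shown below. It relies on the `get_download_stream` method of `dataiku.Folder`. The file names, the folder name in the closing comment, the feature order, and the output wording are all assumptions.

```python
import pickle

# Hypothetical file names; use the names of the files you uploaded.
MODEL_FILES = {
    "red": "red_wine_model.pickle",
    "white": "white_wine_model.pickle",
}


def load_model(folder, wine_type):
    """Read and unpickle the model for wine_type from a managed folder."""
    with folder.get_download_stream(MODEL_FILES[wine_type]) as stream:
        # Read all bytes first, then unpickle them.
        return pickle.loads(stream.read())


def predict_abv(model, residual_sugar, density, ph):
    """Run the model on one row of inputs and format the result for the UI."""
    # The feature order must match the order used when the model was trained.
    prediction = model.predict([[residual_sugar, density, ph]])[0]
    return f"Predicted alcohol by volume: {prediction:.1f}%"


# Inside DSS (folder name assumed):
#     import dataiku
#     folder = dataiku.Folder("wine_models")
#     model = load_model(folder, "red")
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.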

5. Deploying the webapp publicly 

At this point, our webapp is fully functional within our DSS instance. Deploying the webapp publicly will give us a URL that we can distribute to users so that they can interact with our webapp without having to access or log in to our DSS instance directly. These steps are entirely optional and must be carried out by an instance Administrator. 

To start, let’s copy the current URL of our webapp into a notepad. This will enable us to access a few key pieces of information that are stored in the URL more easily. (ex. http://DSS_BASE_URL/projects/WEBAPPBLOGDASH/webapps/osPwXez_winemodelwebapp/view) 

 Then, navigate to Administration > Settings > Login & Security and scroll to find the “Webapp” section. 

Within this section, we can find an area titled “Public webapps,” where we will add our webapp to make it public. To do so, we will add the PROJECTKEY.webAppId, both found in the URL we copied down (ex. WEBAPPBLOGDASH.osPwXez).    

Once added, you must then go back to the webapp and restart the backend to access it via the URL. You can now access your webapp using the URL formatted like http://DSS_BASE_URL/public-webapps/PROJECT_KEY/WEBAPP_ID.   

If you would like a cleaner version of the URL, you can add the PROJECTKEY.webAppId to the section “Vanity URLs” and map it to your chosen keyword.    

This will create a URL formatted like http://DSS_BASE_URL/public-webapps/KEYWORD. Again, be sure to restart the backend for the changes to take effect. 
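To make the URL formats concrete, here is a small sketch that derives the PROJECTKEY.webAppId identifier and the resulting public URLs from a copied webapp URL. The function names are ours, and taking the webapp ID to be the text before the first underscore is an assumption based on the example URL above.

```python
from urllib.parse import urlsplit


def public_identifier(webapp_url):
    """Return "PROJECTKEY.webAppId" from a webapp URL copied from DSS."""
    parts = urlsplit(webapp_url).path.strip("/").split("/")
    project_key = parts[parts.index("projects") + 1]
    # Assumes the ID is the part before the first underscore.
    webapp_id = parts[parts.index("webapps") + 1].split("_", 1)[0]
    return f"{project_key}.{webapp_id}"


def public_url(base_url, identifier):
    """Build the public URL for a "PROJECTKEY.webAppId" identifier."""
    project_key, webapp_id = identifier.split(".", 1)
    return f"{base_url.rstrip('/')}/public-webapps/{project_key}/{webapp_id}"


def vanity_url(base_url, keyword):
    """Build the public URL for a keyword mapped under Vanity URLs."""
    return f"{base_url.rstrip('/')}/public-webapps/{keyword}"


example = "http://DSS_BASE_URL/projects/WEBAPPBLOGDASH/webapps/osPwXez_winemodelwebapp/view"
identifier = public_identifier(example)
print(identifier)
print(public_url("http://DSS_BASE_URL", identifier))
```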

The webapp we have built is currently available here for you to test. While this blog covers some of the basics of webapp development in Dataiku, our team at Aimpoint Digital has experts that can help you design and develop custom webapps to suit your business needs. 

To obtain a copy of the project and get in touch with our team, please fill out the form below.