This video is a continuation of Automated Machine Learning with Low Code - AMA. In this demo, we are going to look at the support ticket system at Tailwind Traders and how we can use automated machine learning to help their support ticket team.
Automated Machine Learning with Low Code - AMA: https://youtu.be/1T5RzTSTaew
C# Corner - Global Community for Software and Data Developers
https://www.c-sharpcorner.com
#csharpcorner #liveshow #ml #ai #automate #lowcode #azure #machinelearning #datascience
0:00
In this demo, we're going to take a look at the support ticket system at Tailwind Traders and how we can use automated machine learning to help their support ticket team
0:14
So Tailwind Traders is a fictitious retail company and their amazing development team have created this fantastic website as well as this native application
0:27
However, as we all know, with new technology, when more and more people use it, be that partners or customers, sometimes they might find some issues inside the application or the website
0:43
In the case of Tailwind Traders, they have a support ticket system
0:47
So customers and partners will submit support tickets whenever they find some kind of issue with the application that they need support with
0:55
Tailwind Traders has an amazing support ticket team and they triage these different tickets and make sure that they get solved as quickly as possible
1:03
But these support tickets keep flying in and the support ticket team are starting to feel a little overwhelmed
1:09
They did tell us that most people walk away very happy with their problem being solved quickly.
1:16
However, unfortunately some of the customers do not have a great experience
1:20
and face long wait times. And so the Tailwind Traders management team asked this question:
1:26
how can we help the support team? That's how broad data science questions can often be. It's: can we
1:35
find something in the data that can support us. What is it that is happening? What are
1:39
the trends? And so before we start any machine learning problem, I just want to caveat that
1:44
you need to explore the data. You need to understand what's in the data and the different distributions
1:50
and anything that might point towards an indicator that might be useful to do some form of prediction
1:56
on. In this case, we have already done that. And one of the things I wanted to show you
2:01
very quickly is what the data actually looks like. So this is the data we have:
2:11
support ticket IDs, and each support ticket has a row. We have different
2:16
information, from geography to rating to theme, to where it's from, to who you are,
2:21
to the date it was created. And you can see we've also done some data engineering
2:28
in here in order to create columns that machine learning really likes, so often
2:34
numeric or categorical columns. And so now that I've got my data in order, one of the things I
2:40
wanted to see if we could do is actually predict this duration column. So as a support ticket lands
2:47
in the support ticket team's system, I want to be able to predict how many hours or days this
2:54
support ticket will take depending on all its other variables, and we want to see if we can
3:00
learn this from historical data. The idea of this is if we can predict the duration we might be able
3:07
to be more flexible on busy periods, staffing hours, as well as start to look at things like
3:12
maybe we could incorporate a chatbot that would help us with frequently asked questions so that
3:18
our support ticket team can focus on the harder problems. So this is our first step: we're going to
3:23
try and create a machine learning algorithm that can predict the duration of a support ticket
3:30
So what I'm going to do is go to the Azure portal here. Inside Azure, the way you get started is you click
3:41
on the portal menu, click Create a resource, go to AI and Machine Learning, and select Machine Learning
3:48
From here you can fill in some basic information about the workspace
3:56
Just to note, if you are wanting to use automated machine learning or the designer facilities inside the UI of Azure Machine Learning
4:06
when you go to workspace edition, make sure you click enterprise. In the basic edition, you can use the Azure Machine Learning SDK, and that includes the automated machine learning SDK
4:19
But that is code first. We're doing low code here, and the Enterprise edition gives us that capability
4:28
I'm going to nip out of this and go to a setup I've already used
4:35
So here's my machine learning setup. And as you can see, there's not a great deal of information
4:41
And here I would probably say this is more for the management of the resources, just like you see with most things inside the Azure portal
4:48
However, as someone who's using the studio, I actually spend most of my time in the Azure Machine Learning Studio
4:55
If you click the launch now button, it opens in a new tab
4:59
And this is called the Microsoft Azure Machine Learning Studio. And in here we can do everything from using the SDK with notebooks, to automated machine learning
5:09
to drag and drop interfaces. We can manage data sets, experiments, models and endpoints
5:15
as well as managing any compute resources or data stores that we want to use
5:19
It really is an end-to-end environment. And so first things first, as you come in here we need to work on the data. So if I click on Datasets, we already have our two datasets here, as well as being versioned, and we can see who created them
5:36
and when they were last modified. But if you wanted to add a dataset, you can simply drop down this Create a Dataset
5:42
You can create it from a local file, or you can actually create something called a Datastore
5:49
And so when you go to Datastores and you add a new Datastore
5:52
you can actually attach to a storage account that is already available
5:57
So now we have our data sets in place. I'm going to select the automated machine learning tab under author on the left
6:06
As you can see, I've done some automated machine learning runs prior to this
6:10
and we can see they take around 30 minutes to run. Therefore, I'm going to create the experiment and then we'll review a previously run one
6:19
So if I click on create a new automated machine learning experiment, it has some breadcrumbs
6:25
of the different stages that we're going to go through to create this run
6:30
First things first, we're going to select our data set. You could add the data set from here if you've not already done it previously
6:35
However, we have our support ticket data just here
6:48
We can create a new experiment and give it a name. I think of an experiment as like a folder
6:53
for keeping all the different runs you do under one name. So I'd want to
6:59
assign this Tailwind Traders Support Ticket V2. I can select my target column. This is
7:12
the column that you want to predict, and so in our case this is duration. You can also select the
7:21
training compute cluster. So this is underneath Compute: you can use things called Azure Machine
7:28
Learning compute. This is scalable compute, so I just tell it that I need a certain type
7:34
of development machine. That's my CPU compute; you can also create GPU compute. But basically it's
7:40
multiple nodes, and when an experiment is submitted to them they scale up, and when they're
7:47
idle they automatically scale back down again, so you really do only pay for what you use
7:54
In this case I'm going to use my CPU compute cluster and click next. Next we're going to define
8:01
the type of task. Well I want to predict the duration column. The duration column is a numeric
8:06
value, and it stands for the amount of time it took to
8:11
solve the support ticket. Therefore we're trying to predict continuous numeric values, and so this is a regression problem: how much or how many? So I'm
8:21
going to click on regression. You'll notice down here there are extra feature
8:25
settings you can set up. So if we click on view additional configuration, you can
8:31
select a primary metric. This is what it's going to try and optimize on; however,
8:34
you do get all of the metrics also delivered back to you. For my regression problem I'm going to
8:40
choose normalised root mean squared error. There is a great page in the documentation that will allow
8:46
you to go ahead and read about all these amazing metrics. There's also this ability to explain the
8:54
best model. If you click on this it automatically shows explainability on the best model that's
9:00
created. Explaining a model is really, really useful for responsible machine learning. So the
9:07
ability for you to understand what columns and features inside your data set affect your model
9:13
And so I'm going to keep that explain option on. Obviously, this will take a little bit of extra time
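As an aside, one common way explainers rank features is permutation importance: shuffle one column and measure how much the model's error grows. This is only an illustrative sketch of the idea, not what Azure Machine Learning's explainer actually runs, and the column names here are made up:

```python
import random

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, rows, y, n_repeats=10, seed=0):
    """Importance of a column = average rise in error after shuffling it."""
    rng = random.Random(seed)
    base = mse(y, [predict(r) for r in rows])
    importances = {}
    for col in rows[0]:
        rises = []
        for _ in range(n_repeats):
            values = [r[col] for r in rows]
            rng.shuffle(values)
            shuffled = [dict(r, **{col: v}) for r, v in zip(rows, values)]
            rises.append(mse(y, [predict(r) for r in shuffled]) - base)
        importances[col] = sum(rises) / n_repeats
    return importances

# Toy data: "escalated" drives duration, "noise" is irrelevant.
rows = [{"escalated": i % 2, "noise": i % 3} for i in range(30)]
y = [10 + 5 * r["escalated"] for r in rows]

def predict(r):
    return 10 + 5 * r["escalated"]

imp = permutation_importance(predict, rows, y)
```

A column the model ignores gets an importance of zero, while a column it relies on shows a clear rise in error when shuffled.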
9:18
If you know that there's certain algorithms that you don't want to be using, maybe you've tried
9:22
them already. Maybe you know that they're actually going to be very, very heavy on compute and you
9:27
don't have the space for that, or maybe they're really big models and you're trying to put them
9:30
on small devices. You can choose to block certain algorithms so that the automated machine learning
9:36
does not try them. We're going to leave that empty. You can select the training time in hours, so we
9:44
could select: don't run any more than one hour. That can be a way of saying, hey, this is how much I want to
9:50
budget for compute, in some senses. You can also be very careful then about how long you
9:59
want it to run, how many runs you want it to do, etc. And the trade-off there is always a trade-off
10:04
between training time and accuracy in most cases. If I drop down validation, it's going to automatically
10:12
select the type, but if you have a specific validation type that you want to do, you can
10:16
select it. And then concurrency: this is how many different models will run in parallel to each
10:21
other. And so we will keep that as its default and click save. On featurization settings
10:28
in here you can actually give it more information about the different feature types, as well as
10:35
actually including or excluding things. So, for example, you can say that actually the support
10:40
ticket ID is a unique value. It's an ID; this is not useful for our machine learning algorithm.
10:46
We might want to say that actually we don't want it to pick out a certain customer, and so
10:51
the patterns in the data, we don't want it to include that. Date created and date completed: we actually got the breakdown of these dates when we did our own featurization. We might want to keep that. This is a duplicate column. Don't add in any extra data
11:10
that's not needed, because it will process it every single time. Completed tutorial: we have also
11:16
added a new featurized column. So instead of saying completed tutorial, yes, no, we actually have a
11:23
one and a zero, and we have the same for rating: high, low we've changed to one and
11:27
zero. Any more duplicates sit down here. We can see that duration is already
11:36
not selected because that's our target column. And if we skip to the next page
11:43
something that we might want to keep in mind is that actually, when we're going to use this model, we won't have the completed date, right? We want the support ticket
11:50
to land in the system and then we predict, so we can't learn from anything that we're not going to
11:56
have at the time. Then click save and then click finish. This is going to go ahead and validate that
12:04
we've done the setup correctly, and also create a new automated machine learning run. It will take
12:10
us to this page, where it can give us updates on the status. So currently it will be about to start;
12:16
it will then go into the preparing phase, from there move into running, and then finally completed
12:23
Once it starts running you can start clicking on these two different tabs
12:28
Data guardrails gives you information about the data pre-processing that the automated machine learning run does
12:36
and models will start to populate all the different models that the experiment has already run
12:42
and so what we'll do is we'll leave this one just here and we'll go ahead and review an experiment
12:48
I've already run. So here's one that I've already run called support ticket regression
13:04
We can see it's got the green tickers completed. It tells us how long it ran for so it took 32
13:11
minutes. It told us which compute target as well as the data set that it input and who created it
13:18
and that was me. On the left here we can see it gives us our best model summary so we can see the
13:24
voting ensemble method performed really well, with our normalized
13:30
root mean squared error at 0.29. It's not ideal, but we could start working from here
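For reference, normalized root mean squared error is the RMSE divided by the range of the target values, which makes a figure like 0.29 comparable across targets with different scales. A small sketch of the calculation (my own helper, not an Azure ML API; the durations are made-up toy values):

```python
import math

def normalized_rmse(y_true, y_pred):
    """RMSE scaled by the range of the true values, so 0 is a perfect fit
    and scores are comparable across differently scaled targets."""
    n = len(y_true)
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return rmse / (max(y_true) - min(y_true))

# Toy ticket durations (hours): true vs. predicted
score = normalized_rmse([12, 48, 84], [18, 45, 80])
print(score)
```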
13:36
We can also view all of our other metrics to just double check that it's as expected for our algorithm
13:44
And these different metrics can give us a lot of information about what is going on with our machine learning algorithm
13:52
There is a great documentation page that will tell you all about each of these different metrics
14:02
We can see that it actually chose to put all of the data in for training data
14:07
For those out there who are deep into machine learning, you'll know that you need a training and a testing set
14:14
And so it's used some form of cross validation here in order to break up that training data and test different parts of it
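In k-fold cross validation, the training data is split into k folds; the model trains on k-1 of them and validates on the held-out fold, rotating so every row is used for validation exactly once. A minimal index-splitting sketch (illustrative only, not the exact scheme this run used):

```python
def kfold_indices(n, k):
    """Split row indices 0..n-1 into k contiguous, roughly equal folds.
    Returns a list of (train_indices, validation_indices) pairs."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return [(sorted(set(range(n)) - set(f)), f) for f in folds]

# 10 rows, 5 folds: each fold holds out 2 rows for validation
splits = kfold_indices(10, 5)
```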
14:21
We can also see that this example actually has a deployed model and we'll come to that very shortly
14:28
And then finally you can see some of the run summary examples
14:33
I mentioned that you can go to data guardrails and this tells you what automatic featurization was done
14:45
And so we can see how it dealt with the validation data and can view additional details
14:51
We can see that some missing features were detected, and what action it
14:57
took. And we can also see that none of our features had high cardinality, and so it double
15:03
checked that we didn't have that. Then we can go to models and we can see all of the
15:09
different types of models with all of the different types of scaling algorithms and
15:14
normalization approaches that it ran, and it really does run many, many algorithms
15:20
We can see that each algorithm took from about six minutes to one and a half minutes to run
15:27
and they're all now sorted by our primary metric. So we can see that the voting ensemble and the
15:37
stack ensemble methods both did the best job, with the max abs scaler just behind. So a certain type of
15:45
scaler seems to work quite well here, with lots of different types of tree algorithms just at the top
15:51
Again, there's some great documentation that tells you a little bit about what all of these different things mean
15:57
inside the Azure Machine Learning documentation. If I click on this model, we can dive even further
16:04
and we can see all of the different metrics that came with this specific model
16:09
as well as the ability from here to actually be able to deploy this model
16:13
But before that, we had an explanation. Explanations go into a slightly different topic than what
16:23
we're covering in automated machine learning; however, I highly recommend going and having a look at the meaning behind what this explanation is, around interpretability. But as you can see, what it can do is tell you what features in your data set are
16:40
most important. And actually, that matters when you're building a model: for example,
16:45
is it discriminating against someone? Is it finding a bias
16:51
within the dataset? When we click on deploy, we can literally deploy our model right from this page
17:02
We can give our model a name. We can provide a description and choose a compute type
17:15
We offer Azure Kubernetes Service and Azure Container Instances. I'm going to choose container instances as this is a test
17:23
However, we do recommend that you use Azure Kubernetes Service when looking at moving to production
17:29
You can enable key-based authentication, which allows you to make sure that you're protecting your endpoint
17:36
So any kind of security that you do around containers anyway should also be done in this case
17:41
You can opt for this no-code deployment that we're doing here, or you can use your own entry scripts and dependencies if you already have that set up
17:51
And then you click deploy. It goes ahead and tells us that this deployment has now started
18:00
So I have already deployed this model previously, and so I'm going to go to my end points on the left here
18:08
Just stretch out this category here. And this Tailwind Traders support ticket system is one that I deployed earlier
18:17
When we go into here, this is now an Azure Container instance that's running in the cloud
18:21
It has our automated machine learning ensemble method deployed onto it so that we can now query it using an API
18:29
And I want to show you how you can consume it. So it's in a healthy state, which indicates that the endpoint is available
18:36
We can also see when this was created, as well as the REST endpoint that I can use as well and a definition
18:42
So if I open this definition in a new tab, this is going to tell me how we can actually query the data
18:48
And so one thing I tend to do, because it comes up as this very, very large JSON string,
18:53
is I'll copy the JSON string, move it into something like Visual Studio Code
19:01
And then there's a really handy thing that if you do Alt-Shift-F
19:05
it will actually format our JSON data so we can read it a little bit better
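If you'd rather script it, the same formatting can be done with Python's standard json module. The input string below is a made-up stand-in for the large one-line definition you'd actually copy:

```python
import json

# Made-up fragment standing in for the endpoint's one-line JSON definition
raw = '{"swagger":"2.0","info":{"title":"ML service","version":"1.0"}}'

# Parse and re-serialize with indentation for readability
pretty = json.dumps(json.loads(raw), indent=2)
print(pretty)
```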
19:10
And definitely take a look through all the detail here of exactly what your endpoint is doing
19:15
However, I'm going to skip down here to the actual schema. So we can see that it's got all of the different data types here
19:22
But one of the really nice pieces is that it actually gives an example piece of data
19:28
And so one of the things I always test first is if I select this data just here
19:35
and copy it. One of the things that I can do is inside something like Postman, which is a nice
19:42
UI for calling REST APIs, I can actually create a post request here that will go to an endpoint
19:50
So what do I need? I need my machine learning REST endpoint. So I'll copy that
19:54
and go back to Postman and post that in here. And as you can see, it's going to call a scoring method
20:03
Under headers we can see we've got content type application/json, and then under body
20:10
we can see that I can actually add in the different setups just here. I will show you, however, that
20:21
the one that we can use as the sample does work. So let me just remove that and paste in this one
20:30
instead. And if I click send, we can see that it was a status 200 and our result came back. So given
20:39
the information given here, it predicts that it's around three and a half days to actually complete
20:46
this support ticket given the metrics. And as you can see, it just returns you in a JSON string as
20:52
well. If you added on another one, for example, so you can see how you could start to populate this
20:59
data very quickly as a batch data set. We could change this to, for example, escalated.
21:06
We could change this to something like security. We could say they've not completed the tutorial,
21:15
the rating is low, and the month for this is actually July
21:27
And if we click send then we can see we now get two different metrics and just given some of those
21:34
changes, it says that that support ticket will be dealt with slightly quicker. And so what we've
21:40
seen there is a full end-to-end of how to use automated machine learning. From the
21:47
automated machine learning UI, you can create your run, you can choose your best model, you can
21:52
interpret your best model, and then if it's correct for production, you can actually deploy
21:57
it to an Azure Container instance and query it using a REST API
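The Postman call above can also be scripted. Here is a hedged sketch using only the Python standard library; the endpoint URL and the column names are placeholders you would replace with your own REST endpoint and the schema from the definition page (Azure ML's no-code deployments typically expect a JSON body of the form {"data": [...]}):

```python
import json
import urllib.request

SCORING_URI = "http://<your-endpoint>.azurecontainer.io/score"  # placeholder

def predict_durations(rows):
    """POST ticket rows to the scoring endpoint and return its JSON reply."""
    body = json.dumps({"data": rows}).encode("utf-8")
    req = urllib.request.Request(
        SCORING_URI, data=body,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Hypothetical rows -- field names must match your training schema
tickets = [
    {"Theme": "Security", "Escalated": 1, "CompletedTutorial": 0,
     "RatingLow": 1, "Month": 7},
]
payload = json.dumps({"data": tickets})
# durations = predict_durations(tickets)  # uncomment with a real endpoint
```

Keeping the request in code like this is also the natural first step toward the batch scoring scenario mentioned above.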
#Computer Science
#Helpdesk & Customer Support Systems
#Intelligent Personal Assistants
#Machine Learning & Artificial Intelligence


