0:00
Hi C# and .NET developers, welcome to a new episode of AttachShop.net
0:05
And I believe you're all into C# and .NET, right?
0:09
Because that is what the show is about. In this episode of AttachShop.net, we're going to learn how you can infuse machine learning,
0:17
data science, and more into your C# and .NET applications
0:22
The world is moving towards AI and data science, and this is a must-watch video for everyone
0:28
And to talk about today's topic, we're joined by Sam. Sam Nassar
0:32
He's a senior software engineer. He's been a Microsoft MVP since 2013
0:36
Very long. He's an MCSA, MCAD, MCTS and whatnot. So without any further ado, let's bring Sam in
0:44
Hi Sam, welcome to the show. Thank you Simon. Good to be here
0:49
Yeah, those were a lot of certifications. I kind of fumbled while reading it
0:52
Yeah, indeed. But very grateful to have them. Yeah. Thanks Sam. Thanks for doing it
0:59
You know, I'm going to let you take the stage because I know you have a lot to cover, and I'm really excited for this session on adding ML to .NET applications
1:05
We're all C# and .NET developers, and I cannot wait for this session
1:10
It's all you now. Excellent. Thank you. Well, so this is one of my favorite topics to talk about primarily because we see a lot
1:18
of machine learning and AI that is penetrating the marketplace in a variety of different ways
1:23
And a lot of us, regardless of which type of enterprise we work in, have a lot of data
1:29
So what is it that we can do to utilize that data and be able to build a machine learning
1:34
model out of it? So let's dive right into it. First of all, let's start with an overview
1:40
So with ML.NET, what it's going to be doing is using a model to provide
1:45
machine learning to .NET applications. So we're going to have an existing .NET application, and using ML.NET, we're going
1:52
to be able to add machine learning capabilities to it. These machine learning capabilities, or scenarios, are available both online
2:00
and offline. And I'll talk more about that as we step through the demo
2:05
Essentially what it's doing is it's using patterns of data to make predictions as opposed
2:09
to a developer having to specify explicit logic or a formula to be able to come up with
2:15
the results. So we just simply feed it the data and it learns from it and then it's able to make
2:21
predictions going forward. So essentially the ML.NET model is an object that contains the transformations, or the formula
2:30
if you will, for being able to take an input value and come up with an output value
2:36
For example, I can feed it a set of home prices over the last three years for a specific zip code
2:42
And then if I ask it for a prediction, how much is a 2,000 square foot home, then
2:48
it will come back and it will say it's this price based on the historical data
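That home-price idea can be sketched in code. This is a minimal illustration, not the exact demo: the file name, CSV layout, and class names are all made up, and it assumes the Microsoft.ML NuGet package:

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input row: square footage in, sale price out.
public class HouseData
{
    [LoadColumn(0)] public float Size { get; set; }
    [LoadColumn(1), ColumnName("Label")] public float Price { get; set; }
}

public class HousePrediction
{
    // Regression trainers write their prediction to the "Score" column.
    [ColumnName("Score")] public float Price { get; set; }
}

public static class HousePriceDemo
{
    public static void Main()
    {
        var ml = new MLContext();

        // Historical sales for one zip code (assumed layout: size,price).
        IDataView data = ml.Data.LoadFromTextFile<HouseData>("sales.csv", separatorChar: ',');

        var pipeline = ml.Transforms.Concatenate("Features", nameof(HouseData.Size))
            .Append(ml.Regression.Trainers.Sdca());

        var model = pipeline.Fit(data);

        // "How much is a 2,000 square foot home?"
        var engine = ml.Model.CreatePredictionEngine<HouseData, HousePrediction>(model);
        Console.WriteLine(engine.Predict(new HouseData { Size = 2000f }).Price);
    }
}
```

The key point is that no pricing formula is written anywhere; the trainer derives it from the historical rows.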
2:55
So some other transformations or prediction types that we can utilize, we have classification
3:00
and categorization, which is essentially taking the input data and classifying
3:06
it into one of two categories. We can also look at regression, meaning that we have a sliding scale, such as home
3:14
prices, or other scenarios where one value is directly or inversely proportional to another
3:21
We can also utilize it to do anomaly detection. So if anything is out of the norm, we can use machine learning to be able to point that out
3:30
Another one of my favorites is recommendations. So we've all used a website called Amazon.com
3:35
And every time you make a purchase, it always recommends something else that will go along
3:39
with it. Right. So we can use that machine learning model to make those recommendations as well
3:46
In addition, there's time series sequential data where it learns off of historical data
3:51
And we can also use it to do image classification where I feed it an image and it will classify
3:56
it as whether it's a fruit or a vegetable, depending on how I train the model
4:04
So for some examples of linear regression, we see that on the left, we have a directly proportional
4:10
scale where, again, it could be an example of home prices rising over the years
4:16
And then there are other examples of inverse proportionality, as you see over on the right-
4:20
hand side, where we have the HDL, or good cholesterol. It's inversely proportional to the BMI, or body mass index
4:29
So the heavier a person is, chances are they're going to have a lower
4:34
good cholesterol level. And so that is an inversely proportional relationship. And there you see the triangles that are listed on the graph
4:43
These are outliers, where you have some extreme cases. But for the most part, it's essentially an inversely proportional linear regression
4:56
So some terminology that we need to cover right off the bat
4:59
You'll hear the word feature. A feature is an input for a machine learning model
5:04
So this could be the size of the home. It could be the mileage on a car
5:09
It could be some other input value that I'm training it on
5:12
The label is the output value that I'm getting from the machine learning model
5:16
So for example, if I'm feeding it in the size of the home, I'm getting the price for that home
5:23
Likewise, if I put in the mileage for a specific vehicle, it will give me the current market
5:28
price for that vehicle based on the data that I trained it on
5:34
So essentially, we have input and output. Now what goes in between is the transform
5:39
That is the type of prediction that is going to be learning from the data that we feed it
5:43
And then it will be able to handle the feature or input going in and then be able to produce
5:48
a label or an output going out. So that's some of the basic terminology
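In ML.NET code, the feature and label map onto plain C# classes. Here is a sketch with the mileage example; the column layout and class names are assumptions, and the attributes come from the Microsoft.ML NuGet package:

```csharp
using Microsoft.ML.Data;

// Input row: the feature the model learns from, plus the known label.
public class CarData
{
    [LoadColumn(0)]
    public float Mileage { get; set; }   // feature: the input value

    [LoadColumn(1), ColumnName("Label")]
    public float Price { get; set; }     // label: the output value we train on
}

// Output row: the transform in between produces the predicted label here.
public class CarPrediction
{
    [ColumnName("Score")]                // regression trainers emit "Score"
    public float Price { get; set; }
}
```

So the transform sits between these two shapes: it consumes the feature columns and fills in the prediction.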
5:53
Now what does the model development look like? How do we get to a stage where we can take data and be able to make a prediction out
6:00
of it? So looking at this image, we see that we start off at the very top where we collect
6:07
and load the data, because that is crucial for machine learning. Then we create the pipeline using the append method
6:14
And then we train the model using the fit method. Afterwards, we then evaluate the model to see, is it in fact working according to what
6:24
we have trained it on? So I would give it a sample value for a home price or the mileage of a vehicle
6:30
And then I would get the output value. And then I can determine whether that's correct or not
6:36
If it's not, then the cycle continues to repeat again where I'm feeding it more data to train
6:40
it more. And again, I use the append method, the fit method to train it, and then evaluate it
6:48
Once I get to a point where I'm satisfied with that model and the values that it's producing
6:53
using the evaluate method, then I can move on to the lower portion of the graph where
6:58
I am saving the model using the save method. So now I created a model or a transform
7:04
And now I'm ready to use it. After I use the save method, what it does is it saves it as a .zip file
7:12
Then I can take it into another project in my solution, be able to load it using the
7:18
load method. And then I can make predictions, as you might have guessed, using the predict method, as
7:24
we see at the bottom of the slide. So essentially, it's a cyclical process
7:30
Once I'm comfortable with the values that it's producing, then I can save the model
7:35
load it, and be able to make predictions going forward. Now let's get into a level of detail
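The whole cycle just described, from loading data through saving and predicting, looks roughly like this in code. It's a sketch with assumed file names and hypothetical input/output classes, assuming the Microsoft.ML NuGet package:

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical rows: mileage in, market price out.
public class CarData
{
    [LoadColumn(0)] public float Mileage { get; set; }
    [LoadColumn(1), ColumnName("Label")] public float Price { get; set; }
}

public class CarPrediction
{
    [ColumnName("Score")] public float Price { get; set; }
}

public static class LifecycleDemo
{
    public static void Main()
    {
        var ml = new MLContext(seed: 0);

        // 1. Collect and load the data (assumed layout: mileage,price).
        IDataView data = ml.Data.LoadFromTextFile<CarData>("cars.csv", separatorChar: ',');
        var split = ml.Data.TrainTestSplit(data, testFraction: 0.2);

        // 2. Create the pipeline with Append, then 3. train with Fit.
        var pipeline = ml.Transforms.Concatenate("Features", nameof(CarData.Mileage))
            .Append(ml.Regression.Trainers.Sdca());
        var model = pipeline.Fit(split.TrainSet);

        // 4. Evaluate on held-out data before committing to the model.
        var metrics = ml.Regression.Evaluate(model.Transform(split.TestSet));
        Console.WriteLine($"R^2: {metrics.RSquared:0.##}");

        // 5. Save as a .zip, load it (e.g. in another project), and predict.
        ml.Model.Save(model, split.TrainSet.Schema, "model.zip");
        var loaded = ml.Model.Load("model.zip", out _);
        var engine = ml.Model.CreatePredictionEngine<CarData, CarPrediction>(loaded);
        Console.WriteLine(engine.Predict(new CarData { Mileage = 60000f }).Price);
    }
}
```

If the evaluation metrics aren't good enough, you loop back: add data, refit, re-evaluate, and only then save.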
7:45
So in order to be able to get to that point, I need to have some prerequisites on my machine
7:50
So first of all, I would need Visual Studio. And I would need to select the .NET desktop development workload
7:57
And more specifically, as you see circled in the pane on the right, ML.NET Model Builder
8:03
So once those are selected, now I'm able to utilize ML.NET in my project
8:11
And I can do this in one of three ways. I can use the UI, where I simply right-click on the specific project, select Add
8:19
And then I'll see machine learning popping up as one of the items that I can add in in
8:22
the context menu. I can also do it programmatically by adding the NuGet package Microsoft.ML
8:29
Or if I'm so inclined and I'm very comfortable with the command line, I can use the command
8:34
line interface to be able to add machine learning into my solution
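The NuGet and command-line routes look roughly like this. The exact mlnet flags vary by tool version, so treat these as illustrative:

```shell
# Add ML.NET to an existing project via NuGet
dotnet add package Microsoft.ML

# Or install the ML.NET CLI as a global tool
dotnet tool install -g mlnet

# e.g. run a classification experiment on a labeled text file
mlnet classification --dataset yelp-label.txt --label-col 1 --train-time 20
```

All three routes (UI, NuGet, CLI) end up in the same place: a trained model you can consume from .NET code.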
8:40
And so with that, like all things, it's better explained through a demo. So let's jump into it
8:46
And what I did here is I created a very basic project
8:52
So this is a console application. And I selected all the defaults when I created it for this ML.NET demo
9:01
And again, I haven't changed anything. I just simply selected all the defaults. It's bare bones, as you see over in my Solution Explorer
9:08
So now, how do I get started by adding an ML.NET model to it
9:12
So I would right-click on the project, select Add. And there I see the machine learning model as one of the menu options
9:22
And again, this is available because I went through the prerequisites of installing the
9:26
desktop environment in my Visual Studio. And I was sure to select the machine learning model
9:35
So once I go ahead and click that, now it's going to walk me through a couple things on
9:39
the menu. And so it's asking me, what should we call this project
9:44
Let's go with MLModel1.mbconfig. I'll accept the defaults. And Add
9:54
It'll take a minute to chug through and add that in. And now it's going to run me through a GUI interface where I'm going to select the scenario
10:03
and identify the various factors. So if you recall, we said that we can use ML.NET to be able to do classification
10:11
regression, value prediction, a variety of different things. So these are all the different scenarios that are available to me
10:18
And as you notice, in the lower right-hand corner of each one, it specifies that it can
10:22
be done locally. And in some cases, I can use it locally or I can utilize the capabilities of Azure so
10:29
that I can utilize some heavy-duty computing power. And in some cases, like object detection, it's only available in Azure
10:39
So for this specific scenario, I'm going to do data classification, which is going to
10:43
select one of two categories, depending on the input that I feed it
10:49
And so it's giving me a summary of my training environment that it's going to be on my local CPU
10:56
This is the type of environment that I have. And next, I'll move on to the next step
11:02
And here, I need to specify the data. So I have a file, and we'll point to that in just a second
11:09
We go over into the data. You'll see that I have a file called yelp-label.txt
11:15
Let's take a look and see what that data looks like. So this is nothing more than a flat text file, as you can tell
11:25
And basically, this is a dump of a set of comments that were retrieved off of Yelp for a
11:30
specific pizza restaurant. And so the comments are classified as either 1 or 0, 1 being positive and 0 being negative
11:39
So you see comments like, wow, love this place, and that is a positive comment
11:45
Crust is not good. Obviously, that's a negative comment, and so on and so forth
11:49
And as you see, I have approximately 1,000 data items here that I can use
11:58
So we'll go ahead and select that file. And as it's loading it, it is now identifying the data
12:06
And the next thing it's going to ask me is, which column do I want to use to be able to
12:11
predict the value? So this will be my label. And if you recall, feature is the input, label is the output
12:18
So looking at the sample of my data, it's giving me a preview of the first 10 rows
12:23
And I want to be able to use column 1 as my label
12:27
And so with that, I'll go into the dropdown. And let's go over here
12:36
So we have data classification. All right, so there I specified my file
12:43
And then I'm going to specify column 1 as opposed to column 0
12:47
So column 1 is my label or my output. Next step is it's going to ask me, how much time do I want to spend training it
12:57
So think of training as, for lack of a better word, educating a
13:02
three-year-old, right? You're showing them what is the right thing to say and what is the wrong thing to say
13:09
And so it's the same thing here. And so with all the different comments that I have, I'm going to show it which ones are
13:16
positive and which ones are negative. And so I'm going to select 10 seconds for starters
13:23
And I'll say train again. And it will run through it. And as it's running through it, you'll notice in the lower window or the output window
13:32
it's going through a variety of different trainers, as we see here, within those 10 seconds
13:36
And it's cycling through them, trying to see which one's a good fit. It says not my day
13:45
Well, one error after the other. All right, let's try that again
13:49
So train again. OK, so we trained it for 10 seconds. It ran through a variety of different trainers down below
14:01
And it gave us some data as far as the accuracy that was produced. And then the training results, as it shows us here, the best accuracy is 0.6911
14:11
The best model that it found for training was FastTreeOva
14:17
And so if we're satisfied with those results, then we can move on to the next step
14:21
If we're not, we can just simply increase the training time. So we'll set this to, let's go 20 seconds
14:27
And we'll do training again. And let me expand the window at the bottom
14:35
And then you'll see that it's still continuing to cycle through for the specified training
14:39
time of 20 seconds
14:43
Within that training time, it's going
14:47
to cycle through a variety of different trainers to see which one will produce the best results
14:54
on the given data. And so if you recall, the first time we trained it for 10 seconds, the results were 0.69
15:03
I believe. But now, because we boosted it up another 10 seconds, it gave us better results of 0.70
15:12
So obviously, our accuracy improved as we gave it more time to train
15:21
So if you recall from the diagram that I showed earlier, we collect the data, we create the
15:26
pipeline, we train it, and then the next step is to evaluate it
15:30
And that's what we see over on the left-hand side. So when we click Next Step, it will take us to evaluating the model
15:37
So if I pass in a value that says crust is not good as my input, what is that prediction
15:43
going to be? So it'll chug through the model that it just created
15:48
And it should produce a result saying that this is a negative comment. And as we wait for the results, you'll see that it says this was a negative comment or
15:56
a result of 0 with a confidence of 76%. For the sake of time, I can just move forward
16:06
But if I'm not satisfied with these results, I want higher confidence scores, then I would
16:10
again, go back to the training process and then train it for a longer period of time
16:15
giving it more data. But since 76% is pretty reasonable, I will move forward
16:23
And now it's asking me, how do you want to be able to consume this model that we just created
16:28
So now that we're satisfied with the results, it's ready to save it off. And we can incorporate that into another project within our solution
16:35
And I can do that in a variety of ways. I can utilize the code. It's giving me a code snippet up in the top window
16:42
I can also add it to the solution as another console application, or I can add it as a
16:48
web API to my solution. So for the sake of simplicity, let's select Console Application, add it to the solution
16:57
And so for this solution, you'll notice that I already have a preexisting one called Model 1 Console App
17:04
But let's go ahead and create a new one
17:19
So let's go over to Consume. Now we're ready to incorporate that
17:23
And we'll do the console application, add to solution. We'll call it ML Model 1 Console App 2
17:32
We'll add that to the solution. And then we'll see what's going on on the back end
17:36
So we'll give it a minute to run through what it needs to do
17:49
So what I'm expecting is that it's going to create the new Console App 2 project
17:54
And it's going to be a console application. And there it is
18:02
So if we take a closer look at that, we'll see that it's a console application
18:07
And like all console apps, there's a Program.cs that's included. So this is all auto-generated code from the ML.NET Model Builder
18:18
And then we'll see that we have sample text or sample data where it says, crust is not good
18:24
This is the first row of the data that it took from the file that we specified
18:28
And it's going to pass it in as a prediction. And then in the end, it's going to make the prediction and then be able to produce that
18:38
in the output using Console.WriteLine. So if we run that second application, it'll take just a minute to spin up
18:51
And there's my console window. And so it's going to automatically pass in the crust is not good phrase
18:58
And then it'll produce the results on the screen for us. And like all good things, it's worth the wait
19:03
So it's just a matter of seconds before it compiles, runs, and then produces the results
19:09
As we see here, going through Program.cs, it passed in "crust is not good" as the starting phrase
19:17
It produced the results on the output window. And it shows us that this is a negative comment
19:24
And so to be more specific, if we look at the code, we're using this MLModel1
19:30
What MLModel1 is, it's the model that was saved off
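The generated consumption code looks roughly like this. This is a paraphrase rather than the exact generated file: Model Builder emits a model class, assumed here to be called MLModel1, with ModelInput/ModelOutput types and a static Predict helper, and the Col0 property name assumes a headerless data file:

```csharp
using System;

// Sketch of the auto-generated Program.cs in the consumption project.
var sampleData = new MLModel1.ModelInput
{
    Col0 = "Crust is not good."   // the first row of the training file
};

// Loads the saved .zip model behind the scenes and scores the sample.
MLModel1.ModelOutput result = MLModel1.Predict(sampleData);

Console.WriteLine($"Text: {sampleData.Col0}");
Console.WriteLine($"Predicted label: {result.PredictedLabel}");
```

So the consuming project never touches the training pipeline; it only loads the saved model and calls Predict.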
19:35
And so that, in a nutshell, is how we're able to utilize ML.NET within an
19:41
existing .NET application. We trained it on some existing data. And then we're able to make predictions out of it
19:52
And so some concluding slides. So the time to train, this is roughly a rule of thumb, depending on the amount of data
19:57
that you have. But typically, the more time you give it, and more data you give it, the better the
20:04
results will be. And so that dovetails into the next question: how do we improve performance?
20:12
As mentioned, giving it more data. But not just data. It needs to be clean, meaningful data
20:18
For example, I gave a sample scenario of home prices where the price is related to the square footage
20:26
But obviously, with a home price, it's related to a lot more than just the square footage
20:32
You have the area. You have amenities within the home, number of bedrooms, number of bathrooms, the zip
20:38
code that it's in, a variety of other factors, whether it's a fenced-in backyard, a swimming
20:43
pool, so on and so forth. So there's a lot of other factors that influence the price of the home
20:50
And so that's what we mean when we say that we need data with context
20:54
So we need to scrub the data, make sure that it is clean in the sense that there are no duplicate values
21:01
No erroneous values, no outliers that don't belong there. But more importantly, we need to give it context of other things that could affect the price
21:12
of that home. In addition, we could use a method called cross-validation, which is basically a technique
21:18
that splits the data into several partitions. And we're able to do machine learning on those different partitions
21:25
And then the result from those different partitions is then gathered. And it's a more effective way of producing a more accurate model
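In ML.NET, cross-validation is a single call. Here is a sketch for the sentiment scenario; the class names and the tiny in-memory dataset are made up for illustration, and a real run would load the full labeled file:

```csharp
using System;
using System.Linq;
using Microsoft.ML;

public class Review
{
    public string Text { get; set; } = "";
    public bool Label { get; set; }   // true = positive, false = negative
}

public static class CrossValidationDemo
{
    public static void Main()
    {
        var ml = new MLContext(seed: 0);

        // Tiny illustrative sample; real use would load ~1,000 labeled rows.
        var rows = new[]
        {
            new Review { Text = "Wow, love this place", Label = true },
            new Review { Text = "Crust is not good", Label = false },
            // ... more labeled rows ...
        };
        IDataView data = ml.Data.LoadFromEnumerable(rows);

        var pipeline = ml.Transforms.Text.FeaturizeText("Features", nameof(Review.Text))
            .Append(ml.BinaryClassification.Trainers.SdcaLogisticRegression());

        // Split the data into 5 partitions, train on each fold, gather metrics.
        var folds = ml.BinaryClassification.CrossValidate(data, pipeline, numberOfFolds: 5);

        foreach (var fold in folds)
            Console.WriteLine($"Fold {fold.Fold}: accuracy {fold.Metrics.Accuracy:0.###}");

        Console.WriteLine($"Mean accuracy: {folds.Average(f => f.Metrics.Accuracy):0.###}");
    }
}
```

Averaging the per-fold metrics gives a more honest accuracy estimate than a single train/test split.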
21:34
Also, trying different algorithms. As you had seen, we walked through a variety of different algorithms or trainers
21:43
And we were able to come up with the best result, all thanks to ML.NET. It did all that for us behind the scenes
21:49
All right. So as a recap, what we did was we took an existing .NET application, or in my case
21:54
it was a very basic, bare-bones console app. We used it to create a new ML.NET Model Builder project
22:03
We specified the file. We specified the training column
22:10
We specified which column is going to be our feature, and then which column
22:14
is our label. And then we're able to produce results or a prediction model out of it
22:20
And then we tested that using the Evaluate step. And then we were able to produce a final product and make predictions going forward
22:29
And so, hope that was helpful. If you have any questions or comments, you can always reach out to me later
22:39
My contact information is available on Linktree under Sam Nassar. In addition, you can always email me at snassar@nistechnologies.com
22:50
My Twitter handle is @SamNassar. You can also find my blog at samnassar.blogspot.com
22:57
And lastly, I invite you to connect with me on LinkedIn if we're not already connected
23:01
And so with that, thanks for having me. And I certainly hope to hear from you
23:05
I've had several people reach out to me, starting their ML.NET project, and I invite more people
23:10
to do so. So looking forward to hearing from you. And thanks for having me, Simon
23:17
That was a great presentation, Sam. You know, I think not just for C# or .NET developers, this can be a crash course
23:23
for anyone who wants to get started with machine learning. Because you covered the basics of machine learning, and then you showed a very important
23:30
algorithm, linear regression, and then you had a demo. And of course, a demo without the demo daemon is not a complete demo, right
23:38
And I personally liked the slide where you talked about the rule of thumb
23:42
I've never seen that, but I'm going to take a reference from that slide
23:47
Thank you so much, Sam. That was an absolutely great session. Everyone, you can reach out to Sam on these details, and you can also find the details
23:54
in the video description. I'll see you in the next episode of Adagio.NET. Until then, take good care of yourself and see you soon