0:30
Hello everyone and welcome to AI42. Hello Yves, hello Leila, how are you
0:37
Hello, thank you for having me again, thanks so much. Yes. Håkan and Yves, thanks so much
0:43
So we should say here for the viewers that this is actually part 2 of a series
0:49
but I think that you will get quite a lot out of it, even if you haven't seen part 1
0:53
But if you want to see part 1, you can just go to our YouTube channel and you will be able to follow the first part of this two-part series. How are you, Yves
1:03
Thank you for asking. I'm feeling amazing. I lost my voice in Las Vegas, but otherwise all good
1:09
Maybe I should ask you, Yves, do you know what day it is today? It's a special day today
1:15
Oh, it is the 14th of December, is it? Yes. No, the 15th, actually. That's right, the 15th
1:22
But it's also the day when we have our 21st AI42 session of this year. Wow
1:30
So that's something to celebrate. Yes, that's it. Great. Good news. Thank you, Håkan
1:38
So I think now we can just have a quick introduction here to AI42 before we continue
1:48
So let us do that. Yes, welcome to AI42
2:06
And the motivation for starting AI42 comes from the recognition that there is really no good starting material for getting into this field
2:15
So AI42 is a strong team consisting of three Microsoft AI MVPs
2:20
So it's me and Yves Ipari, and then Gosia Borzęcka. She couldn't be with us tonight, but usually she's also here
2:27
So what we do is we provide you with a valuable series of lectures that will help you to jumpstart your career in data science and artificial intelligence
2:36
And the way that we do it is we provide you with the necessary know-how so that we can help you land your dream job
2:42
as long as it's related to the fields of data science and machine learning
2:46
And the concept is quite simple. So what we do is we have professionals from all around the globe
2:53
They will explain to you the underlying mathematics and statistics and probability calculations
2:59
and also data science and machine learning techniques. And we will guide you through all of this
3:05
So all you have to do is follow our channel and enjoy the content every second week
3:09
So it will be filled with real-life cases and also expert experiences
3:14
And don't worry if you think this sounds difficult. We will guide you because we have also started from scratch, so we know how it feels
3:22
And you can always stop and rewind the videos, and you can ask for clarifications in the comment
3:27
section, and we will be able to help you. And we hope to assist you on this wonderful journey
3:31
and perhaps one day welcome you as a speaker on our show. And we believe that by creating these types
3:38
of cross-collaborations with other organizations, we will be able to give you the best opportunities
3:43
so that you can broaden your network both in the AI and in the data science communities
3:48
And with the combination of our offered services, we would also be able to support less fortunate
3:54
people and organizations that are not recognized yet, even though they deserve it
4:01
And our shows are sponsored by Microsoft and Maez, and we wanted to say a big thank you for that
4:07
also. And we are humbled by all the help we get from our contributors as well
4:12
Leventa Ponger creates amazing graphic designs for the slides that we show during the stream
4:20
and that you can see on StreamYard as well. And Mari and I create our intro music before
4:26
each stream, and we are in close collaboration with C-Sharp Corner and the Global AI Community, so you
4:33
can see our shows on their own YouTube channels as well, in advance of what we show on our media
4:40
And Nicolette creates our text content and also reviews it: the content we use during the sessions or when sharing something on our web pages or our social media
4:58
You can follow us on Facebook, Instagram, and Twitter, where you can find information about our upcoming sessions and other fun news
5:09
And if you want to re-watch our videos, you can go to our YouTube channel
5:14
And if you want to stay updated about our upcoming sessions, you can see them on our Meetup page
5:21
Yes, and we also have a code of conduct. Our code of conduct
5:27
outlines the expectations we have for participating in our community, as well as the steps for how you can report unacceptable behavior
5:36
And we're strongly committed to providing a welcoming and also inspiring community for everyone
5:42
So be friendly and be patient, be welcoming and also be respectful with each other
5:48
And you can find our code of conduct at this link here. Yes. And I think with that, we can go back to our speaker
5:59
Yes. Yes. Hello again. So I think it's time that we formally introduce Leila
6:19
and what she will talk about in her session. So Leila has been a Microsoft AI and Data Platform
6:28
MVP since 2016, and she was also the first AI MVP in New Zealand and Australia. And she has a PhD
6:36
in information systems from the University of Auckland. And Leila is also the co-director and data scientist
6:43
at RADACAD, a company with many clients all over the world. And she's also one of the bloggers at RADACAD
6:49
with more than 800 articles and more than 9 million readers around the globe on an annual basis
6:55
And also the co-organizer of the Microsoft Business Intelligence and Power BI user group, the SQL Saturday Auckland
7:02
the Difinity conference, and the Auckland AI global community. And she's also an international speaker at most Microsoft conferences
7:12
and has facilitated over 200 sessions and full-day workshops. And she also loves solving jigsaw puzzles, playing board games
7:21
and her dog's company. So we are very, very thrilled to have you here with us, Leila. Thank you, thanks so much, Håkan and Yves. And also Gosia, I know you are a very good team
7:35
It was really great to have a session with you guys about two weeks ago, I believe
7:41
And now I'm going... yeah, it's my pleasure. Thank you. And we are especially humbled because it is quite early there where you are, right
7:51
Yeah, kind of, but it's good, actually. It's 5 a.m. in New Zealand
7:57
So while the rest of the world is on the 15th of December, here it's the 16th
8:02
But it's fun, kind of. It's a fun thing to speak about AI and to be with the community,
8:10
AI community people. So it's fun for me. Thank you. And it's really fun for us as well
8:16
Could you give us a little recap of what we talked about last time? Yes, sure. Last time, we actually talked about an overview of Azure Machine Learning
8:31
We saw what machine learning and the workspace are, and we had a very quick look at Automated ML and also the Azure ML Designer and Notebooks. That's what they are
8:45
So today we go a bit deeper into the Azure ML Designer first,
8:49
to create a pipeline, then deploy it and use it inside code, like Python code
8:55
or C# code. And after that, we're going through
8:59
how we can create a pipeline using Python inside an Azure ML Notebook
9:05
So we're going a bit deeper into how we can actually create a web service
9:10
and create a pipeline through that. So everything becomes a kind of orchestration, an end-to-end process in one structure
9:24
Yeah, I'm really excited because this is one of my favorite tools actually on Azure
9:29
So I can't wait to see something about it. Oh, thank you
9:34
Thanks so much. Thank you so much, Håkan. Yes. Yeah. So we'll leave the stage to you, Leila
9:41
Thank you. Thanks so much. Cool. Hello, everyone
9:56
Welcome to my session about Azure ML orchestration and how we can actually create a web service through that
10:05
As I actually mentioned, I'm based in Auckland, New Zealand. The introduction has already been presented
10:12
But if you have any questions after the session, this is my Twitter and my email address
10:19
So we can be in touch, and you can ask me questions
10:23
If I know the answer, I will answer. Otherwise, I'm going to search
10:26
And it's good for me to hear new questions, because they give me new ideas about what is happening
10:34
Thanks so much, everyone, for coming and joining. This is not my last session for 2021
10:41
Actually, I have another one about Market Basket Analysis in the next six hours, and that will
10:46
be my last session for 2021. Now, I'm going to talk about... actually, you remember we talked about the Azure ML Designer
10:59
Today, we are going to look at how we can work in the Azure ML Designer, how to
11:05
train the model, and then after that, how we can actually create a web service out of
11:10
that and use it. So in the first part we are going through that. I assume that you have some
11:16
knowledge about how we worked with the Azure ML workspace, but still I will go through it. So
11:21
let's go directly to Azure ML. So if you remember, we created a workspace
11:31
that helps us create the Azure ML experiments. After we connect to the
11:39
workspace over here... I talked about Notebooks and Azure ML, the Automated ML and Notebooks. Today, I'm going to talk about one of the scenarios that we
11:52
have, and that is creating a model here. For some of you maybe it's familiar, so I'm going very
12:00
fast over the model that I've already created, to show it to you. But before going through that,
12:06
Let me create a simple one and then I'm going to talk about that
12:11
I just want to share some resources for that before I go through it
12:19
There are the Azure ML fundamentals examples that provide really nice examples of using Automated ML
12:27
There are workshops there about using the Azure ML Designer for regression, classification, and clustering
12:35
I find them really step-by-step and really helpful
12:40
I'll share the link for that. Also, I will share the link with Yves and Håkan so they can actually share it with you
12:49
They are really nice, and I think for people who are just starting or aren't yet familiar with these tools, they should be great
12:56
So let me share it on the private chat and they will share that one with you guys
13:02
The first step that I'm actually going to do is just bring a dataset in over here
13:11
You can bring your own dataset or you can use one of the sample datasets over here, so I use the automobile price data
13:22
This is a dataset that we actually have here. So as you can see, you can preview the data over here
13:32
What is the data about? There are about 26 columns and 250 rows
13:39
So as you can see here, you can drag and drop other things through here
13:44
So for example, I'll just do a quick example over here. I want to select columns over here
13:51
I don't want all of the things to be here. I just connect the output from there to the input over here
14:00
and we select some of the parts of the data that we need
14:05
I'm just going to edit the columns. I can select by rule,
14:13
like column names and the others, or I can actually search by name
14:18
I think we need everything except the column about normalized
14:24
losses; we don't need that one. All together, we need about 25 columns
14:30
That's the one. You see that I don't write any code. All of the steps happen here
14:36
I can clean missing data. Here I can do the cleaning of the missing data
14:45
This is very similar to the Azure ML Studio that we used to have. So for example here, again, you can do the same kinds of things. I think we have missing values on bore and stroke, I believe, if I remember
15:12
right. That's one of the things that we have. Let me... I think maybe I need to run it first; sometimes it's
15:31
not able to show that one because we didn't run the pipeline. So here we want both the
15:38
bore and stroke, and the horsepower, so those are the ones that we need. So you can see that there are
15:45
different ways of doing that. You can substitute the value, or you can remove entire rows, and you can
15:52
actually do all of the data transformation over here. So this is the process that we're creating
16:00
Because this takes a bit of time, I'm going to show you the one that I've already created
16:05
So let me go to that one. So this is the one that I already created. So this is my data
16:13
I selected some of the columns over here. As you can see, if I just click on Edit, you can see that
16:20
I did that, and then cleaned missing data, just for some of the values that had missing data
16:27
Then I bring all of the data onto one scale, because one of the things in machine learning is
16:33
that most of the algorithms use distances; for example, you compute the distance
16:39
or similar. If we don't bring the data onto the same scale, it can impact the algorithm's quality
16:46
So it's recommended, if you have numerical data and you're using regression algorithms,
16:51
that you actually bring your data onto the same scale. Some columns may be in
16:58
the range of, for example, zero to 100. Another one can be between 100 and 1,000, or between zero
17:09
and one. So everything should be on one scale. One of the approaches that we can use is
17:15
min-max, which brings the data between zero and one, but we also have z-score, logistic,
17:21
log-normal, and others. Which of these we use can be slightly different
17:31
depending on the algorithm that we are going to use. So there are some rules in statistics about
17:36
when to use which method. However, for this one, I just use a simple one, which was min-max
17:43
These are the columns that need to be transformed. Definitely, we couldn't transform columns that are
17:52
categorical or are actually text. It should be a numerical value
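To make the scaling idea concrete, here is a minimal sketch of min-max normalization in plain Python; the column names and values are made up for illustration, and the Designer's Normalize Data module does the equivalent for you:

```python
import pandas as pd

# Toy numeric columns on very different scales (made-up values)
df = pd.DataFrame({
    "horsepower": [48, 62, 154, 200],
    "price": [5151, 6295, 16500, 32250],
})

# Min-max normalization: rescale every column into the range [0, 1]
normalized = (df - df.min()) / (df.max() - df.min())
print(normalized)
```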
17:58
These are the data transformation steps. As you can see, very simple ones
18:04
I always recommend that. Yes, in here, if you want, you can apply some SQL data transformations
18:14
Yes, this is possible. You can bring your code over here. There is the environment for
18:22
Python or SQL that we have over here, and we also have Execute R over here
18:31
I'll actually use one of them later. So for example, for Python, yes, there is an environment where you can write your own data transformation using R, Python, or SQL, if you couldn't find the data transformation you need here
18:45
But the best practice is actually to clean your data before bringing it here
18:54
So at least the data has been gathered; for example, if it's collected from different sources, better to bring it together and then do just the primary data transformation for machine learning over here, not the general cleaning of the data
19:10
That's better done before loading into this environment. So after that, this is our cleaned data, and then we're going to split it
19:20
So as you can see, 70% has been used for training
19:24
Just an overview in case you forgot Azure ML Studio: the 70 percent for training goes from here,
19:35
the node located on the left side, and the other one, from the right side, for testing
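For readers following along in code rather than in the Designer, a 70/30 split like the one described here might look as follows with scikit-learn; the data is a made-up stand-in for the automobile dataset:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up data standing in for the automobile price dataset
df = pd.DataFrame({
    "horsepower": [48, 62, 154, 200, 95, 110],
    "price": [5151, 6295, 16500, 32250, 9980, 13950],
})

# 70% for training, 30% for testing, mirroring the Split Data module's setting
train, test = train_test_split(df, test_size=0.3, random_state=42)
```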
19:42
This is a very simple one. These are the models that I have
19:46
I have two algorithms here, Linear Regression and Fast Forest Quantile Regression
19:53
They come here; we train them with the dataset that we have over here,
20:00
and we test it and evaluate it. After you create this one,
20:06
it's time to click on Submit, and it's going to run your experiment
20:11
When you click on Submit, it actually asks you about the compute that you need to have
20:20
It's actually about the compute instance. Before going through that, let me show you something
20:26
If you remember from the previous session, I'll just copy that one. We have a couple of different things
20:34
We have compute instances and compute clusters. A compute instance is mainly used for the training of the data
20:43
Attached compute is mainly used when you're using Databricks, as a connection between Azure ML and Databricks
20:49
A compute cluster is used in Azure Notebooks when you want to instantly run, predict, and use that algorithm for prediction
20:59
An inference cluster can be used when you're going to do the inferencing, for IoT and the others
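As a rough sketch of how such compute targets are created from code with the v1 Python SDK; the workspace config file and the cluster name here are assumptions:

```python
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

# Assumes a config.json downloaded from the workspace in the portal
ws = Workspace.from_config()

# Provision a small CPU cluster that scales down to zero nodes when idle
compute_config = AmlCompute.provisioning_configuration(
    vm_size="STANDARD_DS3_V2",
    min_nodes=0,
    max_nodes=2,
)
cluster = ComputeTarget.create(ws, "aml-cluster", compute_config)
cluster.wait_for_completion(show_output=True)
```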
21:05
So for this example, as you can see here, when I click on top of it, it's going to run on the default compute instance that I have over here
21:17
As you can see, it's running. It is a virtual machine, and you can see the specification over here
21:26
You can select a new one or the current one and then click Submit
21:31
It's going to run all of the steps through here. After you've done that one...
21:38
this has already been submitted for me. I can just do it again
21:43
Submit that one. It takes a bit of time to submit. After that, we have our training pipeline
21:57
Now, we are going to create a web service out of that,
22:04
so we can actually use it elsewhere. As you can see, when you submit that one,
22:09
you get two inference pipeline options over here. Let me see if I can zoom so you can see it properly
22:18
here. So here we can create a real-time inference pipeline, the one that I'm going to use, or a batch one. Batch is when you're going to send, for example, a file for the prediction, that is,
22:35
a batch file. But in real time, it's just a request and response:
22:39
just one input sent and one result received. We are going to use that one, because the other one is
22:46
actually the organizational one, and we may need a GPU to run that, which we don't have. But
22:51
this one actually helps us to create a simple one
22:58
So I'm going to click on real-time inference pipeline. It shows a message while doing that
23:09
So yes, for that one, it has two kinds of things. So I'm just going very fast
23:15
Remove the other one. So let me remove that one. So I'll just stick with this one
23:23
Again, I submit. And then we are going... So you need to choose which algorithm you want to use over here
23:31
So here, as you can see, I have a couple of them. I created them to compare,
23:38
to see which algorithm has the better evaluation. So now is actually the time that I can create the inference pipeline
23:47
So just wait till it creates something for me. Okay, so now, as you can see here, it's going to create a real-time inference pipeline
23:59
It's a bit different from what we had previously. So how can we see them?
24:05
Let's go back to the Experiments tab so we can see what is happening over here
24:11
So as you can see today, we have the regression experiment over here
24:18
And under that, if I just click on it, we have created a pipeline for that one
24:29
Let me go back to the Designer. I think it's better to show it here
24:33
So you see that this is my training part. So again, I can go back to that one
24:38
So it's still over there. I can change that one, whatever I want to do,
24:43
but it also created another one for me, the real-time inference pipeline, which is a bit different from
24:52
what I have over here. Let's see what the difference is here. We have a
24:59
web service input over here; it's the place where you bring your input in
25:05
Here we do the select; we apply some transformations over here
25:11
This is the algorithm that we have. As you can see, you can't change the algorithm
25:17
It's become like a package. It's become a black box over here
25:21
It's kind of like that. These are the steps happening through that
25:27
We have a web service output. We don't need some of the components over here,
25:33
because this is not the training part anymore. So what you can see actually encapsulates everything into packages
25:42
and provides the input and output schema for calling the web service
25:50
So definitely, we don't need the Evaluate Model module, so I recommend deleting that one
25:56
We don't need the original dataset. We just need a sample dataset that we create with Enter Data Manually
26:04
So you need to remove that one as well. And instead of it, I'm going to create the Enter Data Manually module
26:15
So I just copied the data into it. So I just provide some data over here
26:22
This is just a sample of the data over here, to create a schema with it
26:32
So this is the one, and then I connect it through over here
26:37
We also need to select, to edit that one
26:43
So here, for the selection, we need to change some parts of the data over
26:52
here. So here it's not any more... when it's not a prediction, that means...
27:00
sorry, it's not for training anymore, because we want to predict the price. So
27:06
for the previous pipeline we included the price, but for this one, because we just want to predict
27:15
the price, there is no need to include it. So I'm just going to exclude
27:22
that one, which is price. If it's showing over here for me... yeah. And I think that should be
27:33
okay. So just that one. So that's another data transformation. Again, here it is not training,
27:39
because it's just a prediction. So we want to predict the price
27:44
So we don't know what the price is. So we need to exclude the price from that one
27:51
Another one that we have is actually after the Score Model module
27:59
That's where we test the model. But to show the output to people... it doesn't show me the output now,
28:07
because it hasn't run over here. But we need to set some of the variables as input
28:16
and some as output. So we need to change some parts of that one
28:21
So I just want to show the predicted price, the scored label
28:27
I don't need the others. So to do that, there is code
28:32
that you can actually use, a Python script. So I add an Execute Python Script module over here
28:40
And I'm going to transfer the output from here. So I just remove that one, keep that link, and do some data transformation for showing what we have
28:57
So this is the code that I'm going to use. So I just click on it over here
29:02
Let me share the link that I used for doing that. Again, with Håkan and Yves, so they can actually share that one
29:19
Thanks so much for sharing that one. And so, yeah, I think everyone got that link over here
29:27
So this is the link that we have. So here, what we actually do is get our data through here
29:34
And then, as you can see, we create a data frame with the scored label
29:41
So the output should just have the scored label, the predicted price
29:45
So this is the one that you have. You can see that I use Python just for the simple parts
29:52
We never use it for the whole process here. But still, if you have a specific algorithm, you can use it
29:59
So here, I use Python just for data transformation. So for the output, I just
30:06
use it as a select approach, to just show the scored label, the predicted price
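The Execute Python Script module expects an entry function named azureml_main that receives up to two pandas DataFrames and returns a tuple. A minimal sketch of the kind of select transformation described here might look like this; the Scored Labels column is what Score Model emits, and the renamed output column is an assumption for illustration:

```python
import pandas as pd

# Entry point required by the Designer's Execute Python Script module
def azureml_main(dataframe1=None, dataframe2=None):
    # Keep only the model's prediction column from the Score Model output
    scored = dataframe1[["Scored Labels"]].copy()
    # Expose it under a friendlier name for the web service output
    scored = scored.rename(columns={"Scored Labels": "predicted_price"})
    return scored,
```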
30:12
So going back to what we have over here, I think we should be good to go
30:21
And so I'm going to submit. So I'm just going to submit that one to run it first
30:28
Yes, select the existing experiment, which is regression. And the compute instance again; so I just click on that one to select it
30:46
So it shouldn't take that much time, so it's going to actually do that
30:49
This is a bit confusing for many people, because it has the same structure as training
30:55
But this is not the training part. When you are in the real-time inference,
31:00
that means that you are setting up what your web service input should look like,
31:06
as you can see here, and what the output should look like
31:11
These are the encapsulated packages that are there. We can call them a pipeline over here
31:20
As you can see, all of the steps, what the manually entered data should look like, what the web service should look like, everything is actually written here
31:31
These are the steps: we already trained our model, and we are going to use that model for the prediction
31:38
So that's the one. It may take a couple of minutes, and then we're ready to go through that one
31:46
As you can see, all of the steps run in parallel with each other
31:52
If you want to see, I can show you the actual steps
31:56
Again, what I'm doing: I'm going to look at the module that is part of the AI-900 exam,
32:06
which is creating a regression model with Azure Machine Learning. So all of the steps, about what compute you need to use and the other information, are over here
32:19
So you can see everything about exploring the data over here
32:23
So the link that I shared with you is a really good one
32:28
I really recommend following this one, because among all of the training material that Microsoft has,
32:39
I think the AI-900 one is one of the best. Really good design, really step-by-step,
32:47
and you can easily get how it actually works. Okay, so just wait a minute
32:53
I can also check the questions before I go ahead. So thanks so much for sharing the sessions over here, the links
33:07
Okay, cool. After it's actually finished
33:17
we are going to test it. We are going to see how we can actually use it in other applications
33:25
I'm not going... oops, I think there's some problem with this one
33:30
I think maybe I missed some of the steps over here
33:43
Let me see. What was that? So I just need to stop it
34:00
Okay, I'm sure that I maybe missed some of the parts over here. Let me double check
34:12
I think it should be price. Let me double check maybe I made a mistake over here. Let me check the data
34:34
I think it should be okay
34:47
Let me check the price. Maybe I shouldn't have included the price over here
34:55
Just be careful about what we put here; it's very...
35:00
Maybe I didn't put the price over here. Yeah, there is no price over here
35:06
so it got a bit confused by that one. So that's my fault, actually
35:12
So I'm just going to remove that one and submit again
35:18
So this is a process that actually we're doing, so it's going to submit the process again
35:25
Let me double-check that. Actually, I can use the other one that I already had from before
35:33
I think that's a bit better. Let me see if I created a run before
35:48
Yeah, it's still running. Let's see if it's going well; I think it's running here
35:58
Okay. After we've actually done those steps, now we are going to deploy it
36:06
What I'm going to do to show that one: I'll just keep it here
36:12
I'll go to the Notebook and just copy some code over here to test it
36:17
Here you can add a new file, so create a new file;
36:21
you can call it test-regression, and it can be Python code over here
36:30
It's a Python notebook over here. Choose the type, and then we just create that one
36:40
We'll just create that one. We're going to test the code over here
36:46
and see how it's actually created. But before doing that, we need to apply some of the steps over here
36:53
Just waiting till it is done. Okay. So after this process finishes, we are able to deploy it. When we actually deploy it, we get
37:17
an endpoint over here. Let me see if I have one from a previous example. So you see that after you deploy it, click on Deploy, you actually get it under Endpoints; if you can see here, we have Endpoints over here
37:36
Under Endpoints, if you click on one (this is from another example), you get the Consume tab
37:42
Under Consume, you get the REST endpoint over here that you can copy and use for doing that,
37:49
or you are able to test it over here. So this is the Titanic example, and you can see how you can actually test the data over here
38:00
So if I just put it here, I can simply test it, and it provides the results
38:07
So these are the tests for that one. So the same process will happen over here
38:13
So I'm just going back to the Designer that we have. It's still running
38:18
So just... yep, yep, we're almost there. And then after that, we can actually consume it in our code
38:30
So this is a very simple code. Here we have an endpoint, and we have our keys,
38:35
and this is sample data for the predicted price of the car
38:40
And we can just see the result over here. So just the last part is going to be done
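A hedged sketch of what that consuming code typically looks like; the scoring URI, key, and payload fields below are placeholders, and a real request must carry every input column defined with Enter Data Manually:

```python
import json
import urllib.request

# Placeholders: copy the real values from the endpoint's Consume tab
scoring_uri = "http://<your-endpoint>.azurecontainer.io/score"
api_key = "<your-primary-key>"

# Trimmed sample row for illustration only; the real schema needs all columns
payload = {"Inputs": {"WebServiceInput0": [
    {"make": "toyota", "fuel-type": "gas", "horsepower": 62}
]}}

headers = {"Content-Type": "application/json",
           "Authorization": "Bearer " + api_key}
request = urllib.request.Request(scoring_uri,
                                 json.dumps(payload).encode("utf-8"),
                                 headers)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read()))
```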
38:48
Okay, so I think it should be finished in a couple of minutes over here, and then we can actually
39:03
click and deploy it. So just doing that... The whole process, because it's going to
39:12
run and check everything, that's why we call it a pipeline, because all of the
39:17
process of creating and testing happens over here. So if you want to look at that:
39:24
here we're going to connect to the workspace, we create a training script
39:28
So all of these parts are going to be encapsulated in one part
39:34
And then we are going to run the script and create one
39:39
So let's see how it actually processes. Done over here, I think
39:44
Almost finished. Yeah. And then we'll be able to deploy it. So this is not a full end-to-end pipeline
39:57
The pipeline that you can see here: you've already created a machine learning model, and
40:01
it's just going to pipeline your data cleaning and your machine learning process,
40:07
executing it as a web service. So it's not a pipeline that actually parameterizes
40:14
the algorithm or the rest, but still it encapsulates some of the steps that we already have
40:20
So the algorithm, and some of the data transformations, like normalizing the data, everything has been here
40:28
So after it's done, we'll be able to deploy it. So you can click on that one,
40:35
call it regression-price... auto-price
40:45
So here, it should be lowercase, my bad,
40:53
same as most Azure things; it should be lowercase
41:00
So here you have the compute type. If you're using it for batch processing and you want to use it at an
41:09
organizational scale, you need to use Kubernetes, AKS (I call it that; it's the same for me), or ACI
41:20
ACI is when you actually just want to deploy it for your local things,
41:26
and it's not at a huge scale. If you are going to use it in a production deployment,
41:32
you need to use AKS, but for this scenario, I'm just using ACI
41:37
Here, you can enable... if you want to see how many users use it,
41:45
you can enable Application Insights, which provides a report of how many times they use your endpoint
41:52
You can enable SSL. I'm not sure about that, so I need to double-check
41:59
You can set the CPU reserve capacity and the memory for them
42:04
I think that's everything we need. I just click on Deploy, and it's going to be ready to be deployed
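In SDK terms, the same ACI choices (CPU, memory, Application Insights, authentication) map onto a deployment configuration roughly like this sketch; the sizes here are assumptions:

```python
from azureml.core.webservice import AciWebservice

# Dev/test-sized container instance, mirroring the options in the Deploy dialog
deployment_config = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1,
    auth_enabled=True,          # require a key when calling the endpoint
    enable_app_insights=True,   # report how often the endpoint is used
)
```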
42:15
It takes a bit of time; you can see that it actually takes a while to deploy it
42:24
The deployment has just started, so it's going to create a real-time endpoint for you,
42:30
and you can see the endpoint over here. It has started creating. It's not complete yet
42:38
It has started creating the endpoint over here. As you can see, it's still not created over here,
42:45
so it takes a bit of time to create it. But after it's actually created,
42:49
you can use that one. Because we are a bit short on time,
42:53
I'm just showing you that after it's created, you can use it here or in any other application
43:03
So just to share that one: this is the process that you can see until you create that one,
43:11
but it's not just related to that. So you see that we encapsulate the prepared data,
43:16
we train the model, we create a package of the model, we validate the model deployment, and we monitor the model
43:24
So this is a process that you already created, and you want to encapsulate everything
43:30
so everything runs together. We call these pipelines. Pipelines are key to implementing effective machine learning
43:40
operations (MLOps) in Azure. Steps can be arranged sequentially or in parallel
43:48
For example, if you run a couple of algorithms at the same time
43:53
the data processing can be one step, but running the algorithms and creating the models can be separate,
43:59
as a parallel process. Each step can be run on a specific compute target
44:07
This is a good point about it, because this actually gives you scalability
44:13
You can have a couple of different virtual machines, and each algorithm can run on one of them
44:18
This is the power of it: it actually helps you run different virtual machines,
44:25
different algorithms, on the different machines that you have. "Pipeline" can have different meanings
44:34
When I say pipeline in, for example, scikit-learn, a pipeline is a combination of data processing with a training algorithm
44:41
But in Azure DevOps, it can be a bit different. When we call it a pipeline,
44:46
it's not just one process; it can be a combination of processes
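For contrast, here is the scikit-learn sense of the word: a small sketch chaining scaling with a regression model into one estimator; the toy data is made up.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import LinearRegression

# scikit-learn pipeline: preprocessing and training bundled into one estimator
pipeline = Pipeline([
    ("scale", MinMaxScaler()),
    ("model", LinearRegression()),
])

X = [[48], [62], [154], [200]]  # toy horsepower values
y = [5151, 6295, 16500, 32250]  # toy prices
pipeline.fit(X, y)
print(pipeline.predict([[110]]))
```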
44:52
As for common approaches in an Azure Machine Learning pipeline, besides running specific Python scripts, it can be using Azure Data Factory before that to
45:05
get the data from the data source. It can be running notebook scripts on
45:13
Databricks after we've done something in Azure ML. It can be running a U-SQL job, or others
45:20
It's not just related to what you see in Azure ML;
45:24
these can be other jobs that we have in Azure to prepare our data or deploy our data
45:30
These are examples of running that. For example, for the first one,
45:35
we have prepared the data. We prepare the data; we have the script that does our data preparation in Python,
45:49
and we specify what target machine, or compute target, we are going to use
45:57
For the training part, again, we have a script, train_model.py. That's the second one
46:03
These are the steps that we actually specify, and after that, we include them like here
46:11
As you can see here, this is step one, which is data preparation
46:16
Step two was training the model on a specific compute target. Then we are going to create
46:23
a pipeline consisting of step one and step two. This is the process that we have
46:29
Then we are going to run that training pipeline. These are the things that we may use
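Put together, the two steps described above look roughly like this with the v1 Azure ML SDK; the script names, folder, compute target, and experiment name are assumptions standing in for the ones on the slide:

```python
from azureml.core import Workspace, Experiment
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()

# Step 1: data preparation script running on a named compute target
prep_step = PythonScriptStep(
    name="prepare data",
    source_directory="scripts",
    script_name="data_prep.py",
    compute_target="aml-cluster",
)

# Step 2: training script, which could use a different compute target
train_step = PythonScriptStep(
    name="train model",
    source_directory="scripts",
    script_name="train_model.py",
    compute_target="aml-cluster",
)

# Assemble the steps into one pipeline and submit it as an experiment run
pipeline = Pipeline(workspace=ws, steps=[prep_step, train_step])
run = Experiment(workspace=ws, name="training-pipeline").submit(pipeline)
run.wait_for_completion(show_output=True)
```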
46:36
Let's have a look at one of the things that we have over here,
46:41
quickly, and see how it actually works. Again, there is a link I'm going to share with you guys
46:49
That's also a really interesting one. For the DP-100 exam, the guideline and the learning path are really interesting
47:03
I want to share that one with you guys. So let me see if I can share it
47:12
I think that's the pipeline I'm going to use. Let me check whether it's the same one or not
47:30
No. Let me check that one. I just see the same one
47:44
Not that one. Yeah, it should be that one. Pipeline. Okay, let's see which one I used here. Create a pipeline. Yep, it should be that one. Oh, yeah, here. Sorry
48:00
This is an example from there. Let me share the link to this one
48:05
It's already published on GitHub. Let me share that one with you
48:12
That's not this one. Let me share the link so you can actually use that one
48:20
See, this one. I'm going to share it through link over here. Okay
48:45
Here is actually the pipeline that we have. I'm going through the code
48:50
I can see the question Gabriel has: as a data scientist, how much knowledge of Kubernetes or web dev do I need?
48:58
So, we actually have a couple of roles. We have data scientists;
49:05
if they are pure data scientists, they just create the algorithms
49:09
Then there's the part we call AI engineer. AI engineers are people who actually know the algorithms,
49:15
but they should also know which kind of server to use, how to use AKS or ACI for that
49:24
So yes, for that one you need it. If you work as an AI engineer as a role, I mean, if your role in your organization is as the person who needs to deploy everything through Azure, then yes, you need to know it
49:41
And I think there are links that I'm sharing with you about the DP exam, the Microsoft data science one
49:47
And also we have another exam, AI-100, that also provides a really good understanding of that
49:56
Yes, you need to know it. And someone asks me which tools I use most in my daily consulting job: Notebooks, AutoML, or the Designer
50:07
To be honest, if I want to rank them: Notebooks, and then Automated ML
50:14
The Designer one, I mainly use when I want to give a fast demo to people,
50:20
for example, for prototyping. You want to show what you are going to do in the process,
50:26
because for the customer, it's really easy to understand that way. So this is for me;
50:31
Maybe for other people, it can be a bit different. So here, as you can see here
50:37
these are the steps that we go through to create a pipeline
50:41
This is the library that we have, azureml.core. I won't run it because we have limited time,
50:50
so I already ran it before. We connect to our Azure ML,
50:55
then we prepare the data. This is an example with data about diabetes
51:01
So if I want to show you: under Data, we have the diabetes datasets over here
51:11
Let me close that one. So as you can see, we have some data about diabetes, and it is going to get the data from our datastore
51:21
So it's actually using that one to prepare the data over here
51:25
So that's not a big deal. It's going to create the pipeline steps through that
51:30
So it actually defines a pipeline named diabetes-pipeline, to create the different steps
51:40
So as we go further: the first step that we are going to create is to prepare the input data
51:51
So this is the first one, which we call prep data
51:56
So this is the first one it's going to create. As you can see here, it's going to do some data transformation over here: normalize the data, log the processed rows, save the prepared data
52:11
So it's going to prepare the data, and that step is added to the pipeline
52:16
So the other step that it's going to create is the training one
52:21
So it's going to create a training step through that. So again, prepare the training; you know, we are going to split the created dataset into test and train, create the model, and identify what the accuracy is and whatever else it should
52:39
So it's the training one. So here, it's actually going to define the compute, which compute it's going to use. So that's another step we have,
52:51
and after that, it's going to add that one to the pipeline that we have
52:58
So here, this is our pipeline, with the compute allocated to it
53:04
So you can see the whole thing here. We have a step named prepare the data, which gets the input data and
53:13
prepares the data. We have a step two that is going to train the model
53:18
If you remember, we had an argument that was for training the data
53:22
That's our second one. So steps one and two have been created over here in the next step
53:29
So these are the main ones, but they can be different ones. So it's not just limited to training and testing;
53:35
it can include other parts over here. So that's the one
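To hand the prepared data from step one to step two, as the diabetes notebook does, an intermediate PipelineData object is typically threaded through the step arguments. A sketch with hypothetical script and argument names, assuming the ws workspace from above:

```python
from azureml.pipeline.core import PipelineData
from azureml.pipeline.steps import PythonScriptStep

# Intermediate storage that step one writes and step two reads
prepped_data = PipelineData("prepped_data",
                            datastore=ws.get_default_datastore())

prep_step = PythonScriptStep(
    name="prepare data",
    source_directory="scripts",
    script_name="prep_diabetes.py",
    arguments=["--output-folder", prepped_data],
    outputs=[prepped_data],
    compute_target="aml-cluster",
)

train_step = PythonScriptStep(
    name="train and register model",
    source_directory="scripts",
    script_name="train_diabetes.py",
    arguments=["--input-folder", prepped_data],
    inputs=[prepped_data],
    compute_target="aml-cluster",
)
```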
53:41
After that, I'll just go through it. So you run that one
53:46
Let me show you here. It actually shows you the pipeline in a graphical interface over here
53:57
So you can see that this is the pipeline that I have. So my pipeline, if you remember from what we have over here, has two steps
54:07
One was prepare the data; the other, train and register the model, similar to what you see over here
54:15
These are the pipelines here. There is a graphical interface for the pipeline that you can check over here,
54:23
and it shows you that. Maybe I have others, like creating the scripts or getting the data from Azure
54:29
You can see all of the pipelines over here. These are the two main ones that you can see through
54:35
this link over here when it's running. I'll just go through to the end
54:42
Sorry, just going through here. And after that, similar to that one, it actually creates that,
54:51
creates a model for you. And then you create the endpoint
54:58
And then you can actually schedule the pipeline. So if you want all of these processes to happen automatically,
55:06
you can say it can be weekly or daily. So you can schedule that one,
55:11
and that's been done. This is a very simple one, but it can be more complicated, combined with the other Azure components over here.
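Scheduling the published pipeline weekly or daily, as mentioned, can be sketched like this with the v1 SDK; the names are placeholders, and the pipeline is assumed to have been built as above:

```python
from azureml.pipeline.core.schedule import Schedule, ScheduleRecurrence

# Publish the pipeline so it gets a stable REST id that can be scheduled
published = pipeline.publish(name="diabetes-training",
                             description="weekly retrain")

# Trigger the published pipeline once a week
recurrence = ScheduleRecurrence(frequency="Week", interval=1)
schedule = Schedule.create(
    ws,
    name="weekly-diabetes-training",
    pipeline_id=published.id,
    experiment_name="scheduled-training",
    recurrence=recurrence,
)
```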
55:17
Now, in this process that we have over here,
55:27
you see that we connect to the workspace, we create a training;
55:31
that was the first step we did. We create a training script
55:36
The second part was actually registering the model. We provide the steps: one for the
55:44
prepared data, another for training. The next steps can be for the deployment,
55:52
like creating the web service, or it can have further ones, like
55:57
connecting to Azure Data Factory and the other ones. Of course, each time you run the pipeline,
56:05
it can create a new version of the model. These are actually the processes that can happen here
56:13
It starts from creating a folder, then you create the pipeline steps for training and the rest
56:23
With that, the pipeline can be sent off; it creates a Docker image,
56:31
and it is run on the compute target that you have, or it can actually be mounted to your compute target;
56:38
those are the two possibilities, and then it is deployed as an output. So the steps that I showed you were these steps,
56:48
but after that, you can create the image of it and submit it to your Docker registry
56:54
So this is the kind of very simple process. If you're looking for the complex one, I can send a link later on about that,
57:03
but this is a very simple process: how to create this pipeline,
57:07
and how to run it in one part without doing each step manually
57:16
I just sent some of the links through the chat over here
57:21
Any questions, anything about this? So, hi Leila, it was really cool again today with you, thank you. And yes, we have some questions from
57:46
the audience, so let's see. The first one is from Gabriel: as a data scientist, how much knowledge
57:57
of Kubernetes or web development do I need to deploy machine learning inference models in
58:02
Azure? I think I replied to that one. So yes, as I mentioned: you, as the AI engineer,
58:12
need to know that kind of thing. I think I answered that question
58:17
Yes, we actually have a new question here, also from Gabriel. He says: do you have any
58:23
recommendation on how to select the compute instance for running experiments, to balance
58:28
between performance and price per hour? So, if I go back to...
58:36
That's a good question, actually. There's a compute instance over here. So let me actually stop it for now,
58:44
but I'm going to create a new one over here to show you
58:48
So for creating a compute instance, when we go through the configuration settings,
58:54
it's like a virtual machine that we have. The one that I chose here,
59:00
actually, is a very standard DS3 v2; it's one of the minimal ones
59:06
It has four cores, 14 gigabytes of RAM, and 28 gigabytes of storage. But it actually depends on your scenario
59:16
If you're using lots of neural networks, or it's a huge one,
59:22
I recommend creating a bigger one. As you can see here, I chose CPU; I didn't choose GPU
59:29
If you're going to work with huge data coming and going,
59:35
or you want to do the prediction at a huge, organizational scale, or you have image processing and voice processing,
59:43
it's definitely better to use the GPU one. Then among the CPU ones, again, you can select from the others
59:51
The one that I can see here, actually... For this scenario,
59:58
I just use this one. It's just a two-core one. It depends on what you need, actually
1:00:06
There is another question: do you have any tips to prepare for the DP-100 Data Scientist Associate certificate?
1:00:16
What can I say... For context: the AI-900 exam that we have is the fundamentals one
1:00:25
That means that it's easy. Another one that we have is the AI-100 exam,
1:00:30
which is mainly about these things. Let me show you;
1:00:39
this is the AI-100 exam. As you can see, the title is Designing and Implementing an Azure AI Solution
1:00:47
It's mainly going to look at... let me... they changed it to...
1:00:56
Leila, sorry, just a quick comment. I just finished these exams
1:01:01
So the AI-100 is mostly Cognitive Services, and the DP-100 is actually about data science and pipelines
1:01:08
Yes, exactly. I just want to show them the... oh, here. Oh, okay, so here
1:01:13
So, yeah, I passed it about a couple of months ago. So that's it, as you mentioned here
1:01:20
So I think, if you want to know the AI things: yes, this is the part about Azure ML, the technical part over here
1:01:29
But as I mentioned, the DP-100 is mainly focused, if I want to show you, on the algorithm side
1:01:39
So to prepare for this exam, as you can see, I just want to show you the learning paths,
1:01:46
it's really important that you actually understand some machine learning concepts like hyperparameter tuning. So, some of the conceptual things that we use. I couldn't find it here, but you should have a good understanding of working with
1:02:04
Python code for creating the pipeline, and some things about hyperparameter tuning
1:02:13
Some things on the algorithm side, about working with Python
1:02:19
So I totally recommend following this learning path. Again, I can send that one through the chat
1:02:28
so you can have a link to it. And of course, for each exam,
1:02:34
besides following up on everything, it's good to check some of the sample questions;
1:02:40
even Microsoft has some sample questions, because the exam is a bit different from what we do in the daily job
1:02:47
They look at it from a different aspect, so it's good to have some examples of doing that
1:02:53
Anything else from you, Yves? Do you have any comments on that? Not really, only that maybe this DP-100
1:03:04
only includes the basics of the Azure ML workspace. So that's all, yes
1:03:11
Yeah, AI-900. You mean the AI-900 one? AI fundamentals. No, I am actually talking about the data scientist exam
1:03:19
that also includes Azure ML. Yes, yeah. Okay, so because we have three exams:
1:03:25
we have AI-900, that's the fundamentals; we have DP-100, which is more about
1:03:31
hyperparameter tuning, something like this; and the AI-100. We also have a question from Huaou, who asks:
1:03:43
Can you suggest the best practice on how to create a data refresh automation process? So for that one, I can say Data Factory is a good option to use. Also, we have Azure Synapse, so definitely check that one. Azure Synapse is also
1:04:02
connected to Azure ML these days, so that's a really good orchestration between them
1:04:09
Okay, do we have questions that we haven't answered? No, then I think we've actually
1:04:16
answered all the questions here. Yeah. So I think... I found the guide really good. How about
1:04:26
you? Did you find the learning path really good? What's your experience with the learning? Yes, yes,
1:04:33
I did feel very comfortable when I went to the... okay, it's always different when you go to an
1:04:39
exam. You first start with studying this learning path, and then possibly take some
1:04:46
practice exams or something like that. But then, when you actually go to the exam, you
1:04:52
can't fully prepare for an exam like this. Often these use cases are very specific, and
1:04:59
you have to make your own decisions in the end for these exams. So it is good to start with
1:05:06
these lectures, but it's also a good idea to literally go and try out these tools and understand
1:05:13
what is happening if you make any changes with your code,
1:05:17
or with the Designer, or so on. Yes, yes, definitely. So, yeah, definitely experience, as mentioned,
1:05:25
is really important for that. So, yeah, just don't stick with only the material, because it's
1:05:32
only some part of it. So, yeah, it's good to have some practice. Okay, cool
1:05:38
So I think there are no other questions... or are there other questions? Can you see?
1:05:42
I can't see any other questions. No more questions so far. No, you answered these questions, I think. Yeah, this time I can see the comments; last time I didn't see them, but this time I checked the comments
1:05:56
Lesson learned. But yes, thank you, Leila. It was really cool to have you today
1:06:03
And you presented an amazing thing again, so I'm very happy. And I hope our audience also enjoyed the session today
1:06:14
Thanks. I'll just stop sharing. Thank you. And I think with that, we can say a big, big, big thank you to all of the contributors, all of you, all of the speakers we had this year
1:06:34
And all the love that you gave us from the audience and from the speakers and contributors as well
1:06:41
This year was like a blast, really. How many sessions did we have again, Håkan?
1:06:47
This is our 21st session. 21st, yes. So it's amazing. And we can't wait to come back next year and continue this journey
1:06:56
We have amazing sessions coming up next year as well. So I hope you will come and join us to learn more about the tools and services that are available to make great data and AI solutions
1:07:09
Yes, so I think, with that said, we just want to wish everyone a merry Christmas and a happy new
1:07:19
year, and we hope to see you soon again in January. Thank you. And thanks so much for having me, and
1:07:26
happy new year to everyone. Thank you, thank you. Bye-bye, happy holidays!