0:00
Hi everyone and welcome back to this Dash of .NET show
0:03
I'm really, really excited for this episode because we're going to learn something really new
0:07
which I have not seen anyone talking about recently in the community and to deliver this great session today
0:14
we have Daniel who is a Microsoft MVP and a Senior Software Engineer
0:18
We have hosted him quite a few times in the last three or four years and I'm so happy to have him back
0:24
So without any further ado, let's bring back the rockstar Daniel. Hi Daniel, welcome to the Dash of .NET show
0:34
Thank you very much. I'm very happy to be back. Yeah, Daniel. I'll compliment you straight up
0:40
You look even better than you did last year. You have changed your look
0:44
It was a fun day. Yeah, that makes sense, Daniel. So Daniel, what are you going to talk about today
0:52
Well, I'm very excited about this topic. I mean, I will talk about Microsoft Semantic Kernel
0:59
I will jump in a bit and explain what it is. But nevertheless, I have a chance to talk about something very
1:06
new because we just had the chance to see the final release last month
1:12
Yeah, that was really cool to have it released last month and I've seen a few people writing blogs and making videos around it
1:20
I'm so happy that you're going to talk about it today. So Daniel, I'm not going to take much of your time
1:24
Please feel free to go ahead and share your screen and then we can get started
1:29
Very well. Let me share the screen. And everyone who's watching, you can find Daniel's social handles
1:35
in the video description. If you want to follow him on Twitter, on LinkedIn, you can get all the details from there and reach out to him just in case
1:42
if you need any help. I'm sure Daniel is very helpful. I mean, that's what he does, right
1:48
He's a community guy. I see your screen. Yeah, that's your presentation
1:54
I'm going to add your presentation to the stream now. Everybody else can see your screen and it's all you now
2:00
Fantastic. All good? OK, so firstly, why Semantic Kernel
2:11
We had the chance to see some other frameworks. For example, there is a library for consuming OpenAI LLM models
2:24
but that was more like a client. And now we are going to talk about an orchestrator
2:34
Semantic Kernel is more than a client. It's not only sending prompts back and forth
2:40
but it's about using the power of AI to orchestrate some more complex scenarios
2:50
So this is Semantic Kernel. And I will let you know in a bit what it is
3:00
So it's open source. That means you can contribute. You can influence what is being developed there
3:07
And I have to tell you that in the last months, there were a lot of changes
3:14
because the release is very recent. The final release, 1.0.1, took place just a few weeks ago
3:24
And until then, for about half a year, there were a lot of changes
3:30
And why was that? It was because OpenAI had a lot of changes
3:36
They added function calling. They added assistants, a lot of features. And these had a direct impact on the development of this library
3:51
But maybe the most important part is that this library is for .NET developers
4:00
We had a chance maybe to see LangChain, but that was limited to Python and JavaScript
4:10
if I'm not wrong. But now with Semantic Kernel, C# developers, along with Python and Java developers,
4:19
have the chance to use the power of large language models. This library is able to consume services directly from OpenAI along with Azure OpenAI
4:36
but not limited to them because it's able to consume models from HuggingFace
4:43
And maybe, again, it's not very... Actually, I didn't check recently, but in the past, I had the chance to see that this library
4:58
being abstract, is able to consume on-premises models like the LLaMA family of models
5:06
And that's fantastic because you have the chance to run everything on-premises in a silo
5:14
I don't know, in a sandbox and play with it. So, as I said before, this is an orchestrator
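For reference, here is a minimal sketch of wiring up one of these connectors, assuming the Semantic Kernel 1.0 packages for .NET; the model names, endpoint, and key handling are placeholders:

```csharp
using System;
using Microsoft.SemanticKernel;

// Sketch: pick a connector when building the kernel.
var builder = Kernel.CreateBuilder();

// Option 1: OpenAI directly (placeholder model id and key)
builder.AddOpenAIChatCompletion(
    modelId: "gpt-3.5-turbo",
    apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")!);

// Option 2: Azure OpenAI (same kernel, different connector)
// builder.AddAzureOpenAIChatCompletion(
//     deploymentName: "my-gpt-deployment",
//     endpoint: "https://my-resource.openai.azure.com/",
//     apiKey: "...");

Kernel kernel = builder.Build();
```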
5:23
And how is this accomplished? By using plugins, which are containers for semantic functions and native functions
5:36
But what are these? We will see in a bit. Firstly, I would like to tell you that in order to build advanced prompts, we have here a syntax
5:54
And this syntax is able to work with some placeholders for using some variables or even
6:03
functions with arguments. And that's very good because we don't want our prompts to be static text
6:13
We are happy to make them dynamic, injecting some data in these prompts on the fly
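As a small illustration of that syntax, here is a sketch of a prompt function whose {{$input}} placeholder is filled in at invocation time; the prompt text and variable value are invented for the example:

```csharp
using System;
using Microsoft.SemanticKernel;

// Sketch: a dynamic prompt with a {{$input}} placeholder,
// using the kernel built earlier.
var joke = kernel.CreateFunctionFromPrompt(
    "Write a one-line joke about {{$input}}.");

var result = await kernel.InvokeAsync(joke, new KernelArguments
{
    ["input"] = "semicolons"
});

Console.WriteLine(result);
```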
6:24
So, again, what are these plugins? Containers where we can group together functions which serve the same purpose
6:39
for example. We can even mix native functions with prompt functions. But let's say we have a group of functions which are processing text
6:53
Yeah, we would like to put them together like, I don't know, string functions or strings plugin
6:59
whatever name you like. And at the same time, we can use these plugins to keep some native functions
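For instance, such a strings plugin could look like the following sketch; the StringsPlugin class and its functions are hypothetical names for illustration, and the kernel is assumed to be built already:

```csharp
using System.ComponentModel;
using System.Linq;
using Microsoft.SemanticKernel;

// Register the group under a plugin name of our choosing.
kernel.Plugins.AddFromType<StringsPlugin>("Strings");

// Hypothetical container grouping related native functions.
public sealed class StringsPlugin
{
    [KernelFunction, Description("Converts the given text to uppercase.")]
    public string Uppercase(string text) => text.ToUpperInvariant();

    [KernelFunction, Description("Reverses the given text.")]
    public string Reverse(string text) => new(text.Reverse().ToArray());
}
```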
7:11
Let me get into some details. What is each of them? Semantic functions are text
7:20
Text in, text out. It's a concept we had a chance to hear before
7:25
But it's like natural language functions, which are, of course, what large language models are using
7:37
But we are not limited to that because we know these large language models were trained
7:45
using some data which are not very current. So, in order to work with some real-time data, we need to do, for example, a search
8:02
on the Internet. And for that, we need some native functions to call some endpoints
8:09
Or let me tell you about some other scenario. For example, you have an API
8:14
And this API has some specific endpoints. We would like to be able to call them using HTTP client, for example
8:25
But there are a lot of out-of-the-box native plugins
8:36
included in Semantic Kernel. And we can develop our own, which I will demo in a bit in this talk
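As one concrete example of an out-of-the-box plugin, the core plugins package ships small helpers such as a time plugin; a sketch, assuming the prerelease Microsoft.SemanticKernel.Plugins.Core package:

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Plugins.Core;

// Register a built-in plugin under the name "time".
// Its functions can then be referenced from prompts, e.g. {{time.Date}}.
// NOTE: this package was prerelease at the time of the talk and may
// require opting in to its experimental warnings.
kernel.Plugins.AddFromType<TimePlugin>("time");
```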
8:47
So, as soon as we have these plugins, we can use them or we can export them to ChatGPT
8:58
if we like. Or Bing Chat or Microsoft 365. Let me show you a diagram
9:07
So, we have different plugins. And we can mix them together and solve some use cases
9:22
We are going to keep these plugins in our kernel. The kernel is the core concept of Semantic Kernel; it is the glue where we put together
9:37
plugins and connectors and some other settings. And, of course, we are able to call these functions
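Putting that glue role into a sketch, reusing the hypothetical StringsPlugin from before; the connector arguments are placeholders:

```csharp
using System;
using Microsoft.SemanticKernel;

// The kernel ties together a connector, plugins, and settings.
var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion("gpt-3.5-turbo", "<api-key>");  // connector
builder.Plugins.AddFromType<StringsPlugin>("Strings");          // native plugin
Kernel kernel = builder.Build();

// And, of course, we can call a registered function through it.
var reversed = await kernel.InvokeAsync("Strings", "Reverse",
    new KernelArguments { ["text"] = "Semantic Kernel" });
Console.WriteLine(reversed);
```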
9:52
Of course, we can create the plugins in a programmatic way. But we are not limited to this
10:00
We can have some dedicated way of working with these plugins by using some special files
10:09
And these are skprompt.txt, which holds the prompt text, and config.json, a JSON file which keeps
10:18
information about the prompt and the variables used for this prompt. For example, when we try to get a response from a large language model, if we want to be
10:39
creative because we are, I don't know, composing a poem or something or a story, we want this
10:46
LLM, large language model, to be very creative. Therefore, we would like to put the temperature higher
10:55
And if we want something which is precise, like generating Python or C# code
11:04
then we don't need creativity, we need precision. So then we will put the temperature to zero
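In code, with the OpenAI connector's settings type, those two moods could be expressed like this sketch:

```csharp
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Creative output (poems, stories): raise the temperature.
var creative = new OpenAIPromptExecutionSettings { Temperature = 0.9 };

// Precise output (generating Python or C# code): no randomness.
var precise = new OpenAIPromptExecutionSettings { Temperature = 0.0 };
```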
11:12
But there are some other parameters and it's interesting to play with
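For a file-based prompt function, these knobs live next to the prompt: skprompt.txt holds the template and config.json holds the settings and variables. A hedged sketch of such a config.json, with illustrative values, assuming the 1.0 file format:

```json
{
  "schema": 1,
  "description": "Writes a short, creative story about a topic",
  "execution_settings": {
    "default": {
      "temperature": 0.9,
      "max_tokens": 500
    }
  },
  "input_variables": [
    {
      "name": "input",
      "description": "The topic of the story",
      "is_required": true
    }
  ]
}
```

A folder of such files can then be loaded with kernel.ImportPluginFromPromptDirectory, assuming the 1.0 API.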
11:16
And as I said before, we can put some information about the input variables
11:25
By default, we have the name input for the input variable, but we can have more and we can
11:34
customize the names. So on the right side, we have the prompt
11:42
As we can see, it's in English, but we are not limited to English
11:47
You can try playing with your native language. Why not? And inside, what is not so common are these double curly braces, which mark the dynamic pieces
12:08
of our prompt. For example, the input variables, or we can have a function we are calling
12:17
There is another one here, it's a comments variable, right? So we have here a prompt which will replace these placeholders with some injected data from comments
12:31
and some injected data from input. Of course, I have used some other tricks in order to emphasize the most important parts
12:41
of this input. The point is not to build an ideal prompt here, but it's important to make
12:49
it efficient. So having these semantic functions or prompt functions, let's talk a little bit about
13:00
the native functions. And as we can see, it's C# code
13:07
Similarly, we can have Python or Java. These three languages are not developed equally
13:20
So C sharp and Python are more advanced. I mean, Java may lack some features which are not there yet
13:41
This can be checked by visiting the official repository. And there you can even find a matrix showing which features are available for each language
13:59
So let's take a look at this C# class. The native functions are easy to identify because they are decorated with the KernelFunction
14:16
attribute and the description. And the description is very important because the text in the description
14:26
has to be concise so as not to leave room for interpretation. So if you have such a class with kernel functions, keep there only the minimum of
14:45
the functions you are planning to use with your prompt. And be precise and unique with your description because the large language model will try
14:57
to mix and match in order to identify which function is good for your prompt
15:09
So now I think I have to go a little bit in details
15:15
How are these used? For example, I have a robot car, and this robot car has limited functionality
15:31
For example, it can move forward, backward, turn left, turn right and stop
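A sketch of what such a class could look like, combining the KernelFunction attribute and the concise descriptions discussed earlier; the MotorPlugin name and the method shapes are assumptions for illustration:

```csharp
using System.ComponentModel;
using Microsoft.SemanticKernel;

// Hypothetical plugin exposing the robot car's basic movements.
// Descriptions are short and unambiguous so the model can match
// functions to the user's request reliably.
public sealed class MotorPlugin
{
    [KernelFunction, Description("Moves the car forward.")]
    public string Forward() => "moving forward";

    [KernelFunction, Description("Moves the car backward.")]
    public string Backward() => "moving backward";

    [KernelFunction, Description("Turns the car left.")]
    public string TurnLeft() => "turning left";

    [KernelFunction, Description("Turns the car right.")]
    public string TurnRight() => "turning right";

    [KernelFunction, Description("Stops the car.")]
    public string Stop() => "stopped";
}
```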
15:40
Basic movements, right? But maybe I want something more complex. For example, walk like a ballerina or do a full turn, a complete turn, 360 degrees
16:04
And it is not very obvious how these will map to the basic capabilities of our API
16:15
So something, and this is the large language model, will have to figure out
16:22
and break down this complex action into basic pieces. For example, if we ask in our prompt for moonwalk dancing, like Michael Jackson walking backwards
16:42
Well, I'm expecting the large language model to break down this moonwalk dancing into some
16:52
steps which are going forward, more or less. And similarly, for the ballerina, I'm expecting it to break down this dance into some
17:06
basic movements like forward, turn left, turn right, you know, capturing a little bit of this dance
17:16
Okay, so I think you got the point. I talk to my API, describing a complex move, and the large language model
17:28
will break down this complex move into pieces which are very well known by my API
17:42
And let's see how we can do this. Because in order to break down this complex move into pieces, we have to do some tricks
17:55
We have to use some advanced techniques which Semantic Kernel is providing
18:04
And I will talk about function calling and I will talk about planners
18:10
Planners were introduced earlier, and they were responsible
18:22
for breaking down complex actions into smaller basic actions or steps. But recently, OpenAI introduced function calling, and these functions are
18:43
some special functions which the large language model understands. Let me give you an example
18:54
You can call the large language model with the prompt which asks for the weather in your city
19:07
right now. And of course, the large language models were not trained on current data
19:17
So there is no chance it knows the current weather in your city
19:23
So you can add to your prompt some functions which are able to interrogate some API
19:36
for the weather forecast and for the date and time. And having these special functions, which are like APIs
19:52
you can send these functions to the large language model. The large language model will try to identify your needs and it will realize
20:11
okay, in order to respond better to this prompt, I have to send back
20:19
a response to the caller saying I first need to call these functions which were
20:29
sent along with the prompt. So as I said, the response is coming back, this time with some requests
20:38
for these special functions, which are running on the client side. They are getting some responses, for example, the weather and the date and time
20:51
And then they are sent back; look, it's a chat with the large language model
20:56
So the responses are sent back to the large language model. And this cycle is complete when there are no more functions to call, because a function
21:11
can call another function. So in the end, this will be complete
21:19
And then the prompt will be augmented with the necessary information. And the large language model will respond with what you are expecting
21:35
Like what is the weather in your city at this date and time, in a human format
21:49
Yeah, not like JSON, which is the usual response from an API
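To ground the weather example, here is a sketch of the client-side functions that would be advertised to the model; the WeatherPlugin name, signatures, and canned data are invented:

```csharp
using System;
using System.ComponentModel;
using Microsoft.SemanticKernel;

// Hypothetical client-side functions the model can ask us to run.
public sealed class WeatherPlugin
{
    [KernelFunction, Description("Gets the current weather for a city.")]
    public string GetCurrentWeather(
        [Description("The city name.")] string city)
        => $"12°C and cloudy in {city}"; // stand-in for a real API call

    [KernelFunction, Description("Gets the current local date and time.")]
    public string GetCurrentDateTime()
        => DateTime.Now.ToString("f");
}
```

With these registered, the model's reply can come back as a request to run GetCurrentWeather first, and the final human-readable answer is composed from the returned values.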
21:56
Okay, now there are two flavors of function calling. For example, you can have these functions invoked automatically by setting auto invoke to true
22:11
And this is making our code very simple. As you can see below, it's just a few lines of code
22:21
And let me explain a little. So at some point, we inform our kernel, we are going to use motor plugin here
22:33
and then we are invoking this function. And lastly, we are iterating through the responses. As I said, it is
22:46
a chat with the large language model until all the functions are resolved
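The auto-invoke flavor could look roughly like this sketch, assuming the 1.0 OpenAI connector; the prompt text is illustrative:

```csharp
using System;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Tell the kernel which plugin's functions may be called.
kernel.Plugins.AddFromType<MotorPlugin>();

OpenAIPromptExecutionSettings settings = new()
{
    // The kernel runs each function the model asks for, feeds the
    // results back, and loops until no function calls remain.
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

var result = await kernel.InvokePromptAsync(
    "Do a moonwalk dance with the car.",
    new KernelArguments(settings));

Console.WriteLine(result);
```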
22:52
For example, we may expect here to have more steps if we break down the ballerina dance
23:03
For example, go left, go right, forward, backward, and stop
23:09
So we are expecting here to see these steps, which are figured out by this function calling
23:19
inside the large language model. And of course, having these steps, now we can do something
23:31
For example, executing these steps on our side. Maybe you have a robot, which is how I called this presentation, because I started from this idea
23:40
to try to talk with the robot API. Yeah, in the end, we have this car