Hello everyone, and welcome back to The Cloud Show with another episode and another great topic. This is a topic that is top of mind for a lot of people right now. It's about AI, and it's about large language models, and small ones by the way, because we're going to talk about how to decide: am I going to use a large language model, or do I only need a small one? I don't know, but I know who does. That's Christian, and he's on The Cloud Show tonight.
Hello, Christian, my friend!

Hey, Magnus! What a nice picture of you in the intro.

I know, right? It's my show, but you, sir, you are the star. So let's start there. Let's start with you, by letting you tell the audience: who is Christian?
Well, so first, thank you again for inviting me and for having me. It has been a long time that we have tried to set this up, for months I guess. Schedules, schedules, schedules. So: I'm Christian from Thinktecture. I am from Germany, obviously, because I was very, very early before the show.

Yes, you were way on time.

That's normal time for German people. I have been in the business of software technology and software consulting since 1996, and I have always been a distributed applications and distributed systems guy. I have been working with Azure right away after the conference in 2008, where they were showing and announcing Windows Azure, and we could start playing with it. So I've always had my head in the cloud, but I have always tried to look at the end-to-end aspects of software solutions. I'm looking into client stuff, I'm looking into server stuff, I'm looking into on-prem stuff, I'm looking into the cloud, I'm having a look at security, and so on and so forth. And that brings me to some great insights on the technology side, because I call myself a "technology catalyst", which means I'm always happy, and I have the honor, of looking into new stuff, and newer stuff, and even newer stuff than the new stuff, and trying to find out what makes sense, what doesn't make sense, and what could be useful for one or the other of our customers. Our customers are always software developers on the other end, so we are working with enterprise developers and we are working with ISV developers.
Enterprise developers and we are working with isv developers okay brilliant so
2:53
with isv developers okay brilliant so
2:53
with isv developers okay brilliant so let's just like zoom in on our topic as
2:57
let's just like zoom in on our topic as
2:57
let's just like zoom in on our topic as we were saying before the show here we
2:58
we were saying before the show here we
2:58
we were saying before the show here we want to talk about large and small
3:00
want to talk about large and small
3:00
want to talk about large and small language models I love that so recently
3:03
language models I love that so recently
3:03
language models I love that so recently the world had this AI moment AI for
3:06
the world had this AI moment AI for
3:06
the world had this AI moment AI for everything everything has to have ai and
3:08
everything everything has to have ai and
3:08
everything everything has to have ai and everything has to have a co-pilot but if
3:10
everything has to have a co-pilot but if
3:10
everything has to have a co-pilot but if we're talking oh yeah co-pilot yeah
3:13
we're talking oh yeah co-pilot yeah
3:13
we're talking oh yeah co-pilot yeah we're talking about language models now
3:15
we're talking about language models now
3:15
we're talking about language models now um your customers are trying to make
3:17
um your customers are trying to make
3:17
um your customers are trying to make sense of what to do with these language
3:19
sense of what to do with these language
3:19
sense of what to do with these language models am I understanding that correctly
3:22
models am I understanding that correctly
3:22
models am I understanding that correctly yeah so
3:25
yeah so um what our customers are always looking
3:28
um what our customers are always looking
3:28
um what our customers are always looking for is some some kind of a technology or
3:33
for is some some kind of a technology or
3:33
for is some some kind of a technology or a set of technologies that sets them
3:36
a set of technologies that sets them
3:36
a set of technologies that sets them apart from the others yeah because they
3:39
apart from the others yeah because they
3:39
apart from the others yeah because they are developing software and on the other
3:42
are developing software and on the other
3:42
are developing software and on the other side yes and on the other side enables
3:46
side yes and on the other side enables
3:46
side yes and on the other side enables their end customers like you and me
3:49
their end customers like you and me
3:49
their end customers like you and me using certain products or Services uh to
3:52
using certain products or Services uh to
3:52
using certain products or Services uh to have a better user experience and to
3:54
have a better user experience and to
3:54
have a better user experience and to enable new use cases and of course we
3:57
enable new use cases and of course we
3:57
enable new use cases and of course we had this moment uh something around 2010
4:01
had this moment uh something around 2010
4:01
had this moment uh something around 2010 11-ish when we were moving away from
4:04
11-ish when we were moving away from
4:04
11-ish when we were moving away from Pure desktop applications yeah uh to um
4:08
Pure desktop applications yeah uh to um
4:08
Pure desktop applications yeah uh to um mobile and web and desktop crossplatform
4:12
mobile and web and desktop crossplatform
4:12
mobile and web and desktop crossplatform as Solutions right where we had the
4:14
as Solutions right where we had the
4:14
as Solutions right where we had the Advent of single page applications doing
4:17
Advent of single page applications doing
4:17
Advent of single page applications doing web based apps with react and with
4:20
web based apps with react and with
4:20
web based apps with react and with angular and so on and so forth um so
4:24
angular and so on and so forth um so
4:24
angular and so on and so forth um so that was one of those moments where the
4:26
that was one of those moments where the
4:26
that was one of those moments where the isvs and the Enterprise developers had
4:29
isvs and the Enterprise developers had
4:29
isvs and the Enterprise developers had the AHA
4:30
the AHA moment of course they also had that aha
4:33
moment of course they also had that aha
4:33
moment of course they also had that aha moment when they saw what the the cloud
4:36
moment when they saw what the the cloud
4:36
moment when they saw what the the cloud could do of course yes of course like
4:39
could do of course yes of course like
4:39
could do of course yes of course like the idea of serverless isn't that just
4:41
the idea of serverless isn't that just
4:41
the idea of serverless isn't that just like it's brilliant I'm still working
4:44
like it's brilliant I'm still working
4:44
like it's brilliant I'm still working with with uh I'm working with a lot of
4:46
with with uh I'm working with a lot of
4:46
with with uh I'm working with a lot of public sector right now and they have
4:48
public sector right now and they have
4:48
public sector right now and they have known what it is but they have not been
4:50
known what it is but they have not been
4:50
known what it is but they have not been allowed to like use it at all but now
4:53
allowed to like use it at all but now
4:53
allowed to like use it at all but now and so they're coming into this now like
4:55
and so they're coming into this now like
4:55
and so they're coming into this now like I don't know 10 10 plus years into the
4:58
I don't know 10 10 plus years into the
4:58
I don't know 10 10 plus years into the game right yeah exactly it's interesting
5:01
game right yeah exactly it's interesting
5:01
game right yeah exactly it's interesting so we had several of those moments right
5:03
so we had several of those moments right
5:03
so we had several of those moments right yeah but in the past let's say I don't
5:06
yeah but in the past let's say I don't
5:06
yeah but in the past let's say I don't know 10 years 12 years maybe even 14
5:10
know 10 years 12 years maybe even 14
5:10
know 10 years 12 years maybe even 14 years there hasn't been a major thing
5:12
years there hasn't been a major thing
5:12
years there hasn't been a major thing well there was
5:15
well there was blockchain block what block I don't know
5:19
blockchain block what block I don't know
5:19
blockchain block what block I don't know yeah so there was one or the other
5:21
yeah so there was one or the other
5:21
yeah so there was one or the other technology that we thought could be a
5:24
technology that we thought could be a
5:24
technology that we thought could be a major shift but then it was a major pain
5:27
major shift but then it was a major pain
5:27
major shift but then it was a major pain so in the past 10 to 12 to 14 years we
5:30
so in the past 10 to 12 to 14 years we
5:30
so in the past 10 to 12 to 14 years we just had to go on with the things we had
5:33
just had to go on with the things we had
5:33
just had to go on with the things we had but then November 2022 we had that very
5:36
but then November 2022 we had that very
5:36
but then November 2022 we had that very prominent chbt moment and after that
5:40
prominent chbt moment and after that
5:40
prominent chbt moment and after that especially open AI as a company uh was
5:44
especially open AI as a company uh was
5:44
especially open AI as a company uh was pleasing us with releasing access to not
5:47
pleasing us with releasing access to not
5:47
pleasing us with releasing access to not just that web application which we could
5:49
just that web application which we could
5:49
just that web application which we could use to chat but with the models behind
5:54
use to chat but with the models behind
5:54
use to chat but with the models behind that and those are the large language
5:58
that and those are the large language
5:58
that and those are the large language models they are very very large because
6:01
models they are very very large because
6:01
models they are very very large because they have been trained on a very very
6:02
they have been trained on a very very
6:02
they have been trained on a very very large data set data sets and of course
6:07
large data set data sets and of course
6:07
large data set data sets and of course they are very very large neural networks
6:09
they are very very large neural networks
6:09
they are very very large neural networks in order to provide us power that they
6:12
in order to provide us power that they
6:12
in order to provide us power that they actually can provide us so so if I sit
6:15
actually can provide us so so if I sit
6:15
actually can provide us so so if I sit down with a with a browser and and I I
6:18
down with a with a browser and and I I
6:18
down with a with a browser and and I I open up U chat GTP there and I type
6:21
open up U chat GTP there and I type
6:21
open up U chat GTP there and I type things and I I talk to it you know it
6:24
things and I I talk to it you know it
6:24
things and I I talk to it you know it appears as if I'm talking to it h then
6:27
appears as if I'm talking to it h then
6:27
appears as if I'm talking to it h then then I'm talking to a very very very
6:29
then I'm talking to a very very very
6:29
then I'm talking to a very very very large language model the largest ones
6:32
large language model the largest ones
6:32
large language model the largest ones right yes so in the first instance you
6:34
right yes so in the first instance you
6:34
right yes so in the first instance you are talking to a a very sophisticated
6:37
are talking to a a very sophisticated
6:37
are talking to a a very sophisticated web application yeah which does all the
6:39
web application yeah which does all the
6:39
web application yeah which does all the guy and the chat history and the state
6:42
guy and the chat history and the state
6:42
guy and the chat history and the state management and then in the background
6:45
management and then in the background
6:45
management and then in the background there is maybe one of the largest large
6:48
there is maybe one of the largest large
6:48
there is maybe one of the largest large language models in the world most likely
6:50
language models in the world most likely
6:50
language models in the world most likely gbt 3.5 turbo or gbt 4 or gb4 turo yeah
6:55
gbt 3.5 turbo or gbt 4 or gb4 turo yeah
6:55
gbt 3.5 turbo or gbt 4 or gb4 turo yeah all of them have very very interesting
6:57
all of them have very very interesting
6:57
all of them have very very interesting names that you can crazy names actually
7:00
names that you can crazy names actually
7:00
names that you can crazy names actually and they get better when we will be
7:02
and they get better when we will be
7:02
and they get better when we will be talking about the small language models
7:04
talking about the small language models
7:04
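The split Christian describes, a sophisticated chat web application in front and a model behind an API, comes down to a single HTTP call once you go past the web UI. Here is a minimal sketch against an OpenAI-style chat-completions endpoint; the API key is a placeholder, and `gpt-3.5-turbo` is just one of the model names mentioned above:

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"  # OpenAI-style endpoint

def build_chat_request(model: str, user_message: str) -> dict:
    """Build the JSON body for a chat-completions call.

    The web UI keeps chat history and state for you; against the raw API
    you send the whole conversation yourself as a list of messages.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }

def ask(api_key: str, model: str, user_message: str) -> str:
    """Send one turn to the model and return the assistant's reply text."""
    body = json.dumps(build_chat_request(model, user_message)).encode("utf-8")
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        answer = json.loads(resp.read())
    return answer["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # "sk-..." is a placeholder: a real API key is required to run this.
    print(ask("sk-...", "gpt-3.5-turbo", "Large or small language model?"))
```

The same request shape works for many hosted models, which is part of why it became the de facto interface for talking to language models from application code.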
talking about the small language models yeah so tell us what so okay so you
7:07
yeah so tell us what so okay so you
7:07
yeah so tell us what so okay so you won't be using a large language model
7:09
won't be using a large language model
7:09
won't be using a large language model for everything you might need to use a
7:12
for everything you might need to use a
7:12
for everything you might need to use a smaller language model for some other
7:14
smaller language model for some other
7:14
smaller language model for some other scenarios why is that you could you
7:16
scenarios why is that you could you
7:16
scenarios why is that you could you could so
7:18
could so um the large language models have been
7:21
um the large language models have been
7:21
um the large language models have been made prominent by let's call them not so
7:25
made prominent by let's call them not so
7:25
made prominent by let's call them not so open- Source companies okay there is
7:28
open- Source companies okay there is
7:28
open- Source companies okay there is open in the name of open AI but maybe
7:30
open in the name of open AI but maybe
7:30
open in the name of open AI but maybe they are not so open because they did
7:32
they are not so open because they did
7:32
they are not so open because they did not open the the data sets they did not
7:35
not open the the data sets they did not
7:35
not open the the data sets they did not open the code and so on and so forth uh
7:38
open the code and so on and so forth uh
7:38
open the code and so on and so forth uh which is fine because it's their
7:40
which is fine because it's their
7:40
which is fine because it's their business model I don't judge them but on
7:45
business model I don't judge them but on
7:45
business model I don't judge them but on the other side there has been a very
7:46
the other side there has been a very
7:46
the other side there has been a very very large open source Community which
7:49
very large open source Community which
7:49
very large open source Community which has always been very very active in the
7:52
has always been very very active in the
7:52
has always been very very active in the background and which was maybe somehow
7:55
background and which was maybe somehow
7:55
background and which was maybe somehow surprised by that gbt um moment but then
7:59
surprised by that gbt um moment but then
7:59
surprised by that gbt um moment but then suddenly they uh they stopped hiding and
8:03
suddenly they uh they stopped hiding and
8:03
suddenly they uh they stopped hiding and they came out of their caves and now we
8:06
they came out of their caves and now we
8:06
they came out of their caves and now we have a very very um diverse ecosystem
8:09
have a very very um diverse ecosystem
8:09
have a very very um diverse ecosystem and Community with large language models
8:11
and Community with large language models
8:11
and Community with large language models being hosted by the big ones large
8:14
being hosted by the big ones large
8:14
being hosted by the big ones large language models being hosted by an open
8:16
language models being hosted by an open
8:16
language models being hosted by an open source Community but also smaller models
8:20
source Community but also smaller models
8:20
source Community but also smaller models which are a trained on public data you
8:23
which are a trained on public data you
8:23
which are a trained on public data you can actually see the data sets which
8:26
can actually see the data sets which
8:26
can actually see the data sets which they have been trained on and uh see uh
8:29
they have been trained on and uh see uh
8:29
they have been trained on and uh see uh b b a b uh they they are much smaller in
8:35
b b a b uh they they are much smaller in
8:35
b b a b uh they they are much smaller in terms of the parameters of the neural
8:38
terms of the parameters of the neural
8:38
terms of the parameters of the neural network which means uh they are not as
8:41
network which means uh they are not as
8:41
network which means uh they are not as powerful as the large language models
8:43
powerful as the large language models
8:43
powerful as the large language models but they can be run in a much more
8:46
but they can be run in a much more
8:46
but they can be run in a much more economic way and they maybe can even be
8:50
economic way and they maybe can even be
8:50
economic way and they maybe can even be run on my system maybe run on my laptop
8:54
run on my system maybe run on my laptop
8:54
run on my system maybe run on my laptop maybe be run on the server of my
8:56
maybe be run on the server of my
8:56
maybe be run on the server of my customer or I actually here in that
9:00
customer or I actually here in that
9:00
customer or I actually here in that corner R I have a Raspberry Pi five
9:03
corner R I have a Raspberry Pi five
9:03
corner R I have a Raspberry Pi five running a small language
9:06
running a small language
9:06
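Running a small model on your own laptop, server, or even a Raspberry Pi typically means talking to a local model server instead of a cloud endpoint. A sketch, assuming an Ollama-style server on its default port; the hosting tool and the model name `phi` are assumptions for illustration, not something named in the conversation:

```python
import json
import urllib.request

# A local model server such as Ollama listens on this port by default;
# adjust the URL if your setup differs.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON body for a single, non-streaming completion."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate_local(model: str, prompt: str) -> str:
    """Ask a locally hosted small language model for a completion."""
    body = json.dumps(build_generate_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # "phi" stands in for whatever small model you have pulled locally.
    print(generate_local("phi", "Summarize: small models can run on-device."))
```

Nothing leaves the machine in this setup, which is exactly the freedom of choice Christian is pointing at: the same application code can target a local small model or a hosted large one.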
That's interesting, because that opens up a whole other set of applications.

Exactly. And I tell you, if you have been following the press a little bit: in the middle of June there is WWDC, the Worldwide Developers Conference from Apple, and I think they will finally be launching a small language model for iOS, because it makes sense, right? Because, what is a language model? A language model is a program, a service, an application that understands natural language, either written or spoken. And that enables a bunch of new ideas of how we can work with software, of how we can enable new use cases in our existing or in our new software. And maybe we don't always want to have to talk to those servers and those cloud systems; maybe we want to have some kind of freedom, a choice, where we can pick and choose the right model, either large or small. And knowing what data it has been trained on is another valuable thing. Because if you have a large language model but you have no idea what data it was trained on, you don't know what that data is. It can be data that is maybe heavily biased against something; it can be data that doesn't know anything about the thing that is your business. How do you know? You don't know.
your business yeah how do you know you don't know yeah yeah yeah exactly and
10:56
don't know yeah yeah yeah exactly and
10:56
don't know yeah yeah yeah exactly and um for me and for the for the use cases
10:59
um for me and for the for the use cases
10:59
um for me and for the for the use cases it's actually not about the about the
11:02
it's actually not about the about the
11:02
it's actually not about the about the the world knowledge of those uh models
11:05
the world knowledge of those uh models
11:05
the world knowledge of those uh models but about the quality of understanding
11:07
but about the quality of understanding
11:08
but about the quality of understanding language and giving me
11:10
language and giving me
11:10
language and giving me back data that I can work with in my
11:14
back data that I can work with in my
11:14
back data that I can work with in my applications so there are several use
11:17
applications so there are several use
11:17
applications so there are several use cases where you can use a large language
11:19
cases where you can use a large language
11:19
cases where you can use a large language model of course you can use it and do
11:21
model of course you can use it and do
11:21
model of course you can use it and do role playing a lot of people are doing
11:23
role playing a lot of people are doing
11:23
role playing a lot of people are doing this right you can let it write a poem
11:27
this right you can let it write a poem
11:27
this right you can let it write a poem and let it write you know all those uh
11:30
and let it write you know all those uh
11:30
and let it write you know all those uh things but I see various use cases one
11:33
things but I see various use cases one
11:33
things but I see various use cases one of the most prominents I guess is uh
11:36
of the most prominents I guess is uh
11:36
of the most prominents I guess is uh chat with your
11:37
chat with your data talk to your data right so what
11:40
data talk to your data right so what
11:40
data talk to your data right so what others call co-pilots I guess guess yeah
11:44
others call co-pilots I guess guess yeah
11:44
others call co-pilots I guess guess yeah okay a part of the co-pilot is also talk
11:46
okay a part of the co-pilot is also talk
11:47
okay a part of the co-pilot is also talk to your data yeah um but I mean the
11:50
to your data yeah um but I mean the
11:50
to your data yeah um but I mean the GitHub co-pilot for example right it's
11:52
GitHub co-pilot for example right it's
11:52
GitHub co-pilot for example right it's trained on all the data that is
11:54
trained on all the data that is
11:54
trained on all the data that is available on G exactly ex all the code
11:56
available on G exactly ex all the code
11:56
available on G exactly ex all the code has ever written and and been checked
11:58
has ever written and and been checked
11:58
has ever written and and been checked into any of the repost and all the you
11:59
into any of the repost and all the you
11:59
into any of the repost and all the you know everything in there it understands
12:02
know everything in there it understands
12:02
know everything in there it understands that domain that's that's what it knows
12:04
that domain that's that's what it knows
12:04
that domain that's that's what it knows that's why it can suggest to you what's
12:07
that's why it can suggest to you what's
12:07
that's why it can suggest to you what's the next code statement I should be
12:08
the next code statement I should be
12:08
the next code statement I should be writing because it it knows that with a
12:11
writing because it it knows that with a
12:11
writing because it it knows that with a reasonable degree of certainty right
12:13
reasonable degree of certainty right
12:13
reasonable degree of certainty right yeah yeah but with
12:15
Yeah, but with a large or a smaller language model, which you don't have to train or even fine-tune, you can use an approach called RAG, retrieval-augmented generation, where you first run a data ingestion pipeline and put the data into a special database called a vector database. Then you do a semantic search on that part of the system, you get back results, you take the top three, four, five results, put them into the call to the large language model, and the large language model gives you a nice human-like answer. So that's one approach.
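The RAG flow described here can be sketched end to end. This toy uses bag-of-words counts in place of a real embedding model and an in-memory list in place of a vector database; the documents and query are made up for illustration:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words token counts. A real ingestion pipeline
    # would call an embedding model and store the vectors in a vector database.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1) Ingestion: chunk the documents and index their vectors.
docs = [
    "The refund policy allows returns within 30 days.",
    "Support is available by email on weekdays.",
    "Invoices are sent at the start of each month.",
]
index = [(doc, embed(doc)) for doc in docs]

def retrieve(query: str, k: int = 2) -> list[str]:
    # 2) Semantic search: rank chunks by similarity, keep the top k.
    qv = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query: str) -> str:
    # 3) Augmentation: stuff the retrieved chunks into the model call, so the
    #    language model answers from your data rather than from its training.
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("what is the refund policy")
```

The final `prompt` string is what would be sent to the large (or small) language model.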
12:55
Yeah, exactly. It's pretty cool that we can do this now, though. What you're saying is that companies would like to talk to their data. They have a lot of data, but they wouldn't want to give that data to the internet and let it be consumed into the large language model. Maybe that's business data, right?
13:15
Yeah. So we have customers who are fine with that, because, as you know, you can deploy GPT-4 Turbo in a data center around the corner with Azure, right? So that's fine; you have an Enterprise Agreement with Azure, and maybe you already have all your emails inside of Office 365, so you are in the cloud anyway. But then there are, of course, use cases where you really have another level of privacy, like in legal cases or in tax situations, where you really cannot give out all that data. And then we need to look at maybe not-so-large models, but a little bit smaller models that we can fit onto our hardware and into our existing infrastructure and still be able to fulfill our use cases.

So essentially using the same technology, except with a completely different data set, your customer's data specifically, and it's...
14:20
No, no, we actually also use different models. There are a lot of offerings out there; you can download them. There is OpenAI; you cannot download that model, it's just an API.

It's an API, but the model is too big; you couldn't download it if you tried anyway, right? And the same goes for Azure OpenAI.

But then there are others from the open-source world, like Mistral, like Llama from Meta, and like Phi, Phi-2 and Phi-3 from Microsoft. You could run them, or you could host them in the cloud, and you can do that in any of the prominent cloud providers, but you can also just download them and let them run. But then those are still just the bare-bones models.
15:22
Now you need to have the solution again in place to let the model know your data. And there are basically two approaches to letting the model know your domain: what your domain language is, what your domain thinking is, the way your domain expresses things. One is fine-tuning the model. That's still quite expensive, but it's getting better. The other approach, again, is doing RAG, retrieval-augmented generation, where you just have your data sitting in a database, in blob storage, or on a disk; you put it into a vector database, and then you can do a semantic search on it, get back the data, and talk to your model. And now the interesting part: there is an API standard, more or less, which is the OpenAI...
16:23
API. "OpenAI," "API": almost the entire alphabet right there. Oh!

And everybody and his mother, more or less, is now trying to mimic the OpenAI API for their own models or model-hosting solutions, so that you are able to switch between the large ones and the small ones and the hosted ones, and so on and so forth. So you have quite a lot of flexibility on the client, on the consuming side.
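That flexibility on the consuming side can be shown concretely: because so many providers mimic the OpenAI chat-completions shape, switching between a hosted large model and a locally hosted small one is often just a different endpoint and model name. A sketch of the idea; the URLs and model identifiers below are illustrative, not authoritative:

```python
import json

def chat_request(model: str, user_msg: str) -> dict:
    # OpenAI-style chat-completions payload; the same shape is accepted by
    # many compatible servers (Azure OpenAI, Mistral's platform, local
    # runtimes with an OpenAI-compatibility mode, and so on).
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_msg},
        ],
        "temperature": 0.2,
    }

# Only the endpoint and model name change between backends; the client code
# stays the same. (Endpoints and names here are placeholders.)
backends = {
    "hosted-large": ("https://api.openai.com/v1/chat/completions", "gpt-4-turbo"),
    "local-small": ("http://localhost:8000/v1/chat/completions", "phi-3-mini"),
}

payloads = {
    name: json.dumps(chat_request(model, "Summarize our refund policy."))
    for name, (url, model) in backends.items()
}
# An HTTP POST of each payload to its URL would go here.
```

The same request body works against either backend, which is what makes the large/small and hosted/local swap cheap on the client side.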
17:03
Yeah, that's good. So now you can pick and choose. And so when would you choose... I think you were alluding to it: when your data is maybe not so public, when there's a reason, then you might not want to move that data to the cloud; you want to have it offline.
17:20
Yes. So one aspect is privacy. Another aspect is ubiquity, which means I really want to have it everywhere. I think we will have language models everywhere: on our phones, on our tablets, on our laptops, on edge devices in IoT scenarios, here on the desktop in the company, and so on and so forth. So that's number two. Number three is maybe specialization, because those large language models are know-it-alls, really. And maybe you don't need those know-it-alls; you really want an expert in finance, an expert in tax, an expert in X, Y, and Z, which is much easier to achieve when you are using a smaller model.
18:14
That makes sense. I think Scott Hanselman did it funny: he was inside of Visual Studio Code, talking to Copilot, and he asked it for a recipe for an omelette or something. Like, yeah, maybe you need that, but maybe not in the context of "I'm in Visual Studio Code." So maybe that language model knows a lot of things that are not relevant.

As far as I know, Copilot uses the Codex model, which should not be able to answer that, but somehow he got something; I don't know exactly what he did. But at least the point is still valid, right? When you're in a certain context, you don't need all the world of other contexts; you want a specialist in the context that is relevant for what you're doing. Yeah.
19:04
And then the last point I see is what we call performance, and performance not only in the classical sense. Of course, in the classical sense as in latency, because sometimes talking to GPT-whatever-number can be quite slow; although they are called Turbo, they are not turbo.

Maybe it isn't so turbo. What's in a name?

What's in a name! So it's about latency, but it's also about quality: maybe the data is of a certain level of quality, or it has a certain bias that you can maybe even tweak and control, you know. So performance is a very, very important factor.
19:55
Yeah. Then of course you have a myriad of language models out there. There is a community called Hugging Face; have you ever heard of them?

I've heard about Hugging Face, but I'm not sure what I know. They have a very nice logo; they have that hugging-face face.

Yes, from the emojis, exactly. And this is more or less like GitHub for AI. They have everything, not just large and small language models; they have literally everything in AI. And from there you can dive into the portal and come out weeks later, right, because it's so huge. And they have what they call evaluation reports and leaderboards, where you can have a look at which large or small language model is how good, or not so good, for which kind of use case and which kind of scenario. Okay.
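The idea behind such leaderboards can be shown with a toy: score each model's outputs against a benchmark and sort by the score. The models, answers, and exact-match metric here are all made up for illustration; real leaderboards use far larger task suites and more elaborate metrics:

```python
# A tiny made-up benchmark of (question, reference answer) pairs.
benchmark = [
    ("What is 2 + 2?", "4"),
    ("Capital of France?", "Paris"),
    ("Opposite of 'hot'?", "cold"),
]

# Hypothetical outputs from two candidate models, one answer per question.
model_outputs = {
    "small-model-a": ["4", "Paris", "warm"],
    "small-model-b": ["4", "Paris", "cold"],
}

def exact_match_score(answers: list[str]) -> float:
    # Fraction of benchmark items answered exactly right.
    hits = sum(a == ref for a, (_, ref) in zip(answers, benchmark))
    return hits / len(benchmark)

# Rank the candidates by score, best first, leaderboard-style.
leaderboard = sorted(
    ((name, exact_match_score(ans)) for name, ans in model_outputs.items()),
    key=lambda row: row[1],
    reverse=True,
)
# small-model-b ranks first with a perfect score.
```

This is exactly the shape of information a leaderboard gives you: which model does how well on which kind of task.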
20:55
And the models that I personally, and we with our customers, work with are either in the cloud, like OpenAI and Azure OpenAI in the relevant situations, yes. Then there is Mistral AI, a French company, which is very, very interesting. They have a platform they call La Plateforme. And then Mistral also has open-source models, like Mistral 7B, and the B is for billion parameters, and then there is also the Mixtral. And then there are derivatives, like Zephyr; you can hear it: Mistral is a wind, Zephyr is a wind, so they are all into those weird names, also in the open-source world. And of course, now we are also looking into Llama 3, which was released just a few weeks ago by Meta, and into Phi-3, which was released to the open world by Microsoft. So you kind of need to have experts around to advise you on which model is the one that you're going to use.
22:12
You know what? I have the comfort that I can do this 100 percent, 24/7, more or less. If you can, yeah. Exactly, however much energy you have for it. Which is good, you know; I have a lot of energy.
22:29
I know, I do know that. All right, well, I think this has been very enlightening, and I think we could talk for hours, possibly days, about this. Sure. But I'm going to have to ask you to come back and talk more about this topic, because we are out of time for this episode.

But anyway, it's been a bloody brilliant conversation; I really, really enjoyed it. Thank you so much, Christian, for being on The Cloud Show today.

Thank you for having me again.

Oh, absolutely, our pleasure. And audience, I hope you enjoyed this episode; let us know in the comments, etc., and see you on the next episode of The Cloud Show. Thank you!