Magnus: Hello, hello everyone, and welcome to yet another episode of The Cloud Show. Of course we have a new guest today, and it's going to be an interesting conversation about a very hot topic, one that is complicated for many. People feel like, "Oh my god, there's so much I need to learn." Fortunately we have a star of the show who knows all about AI and the trends shaping the future of AI. That's what we're going to talk about: three different trends. And I'm going to do that together with a good friend of mine, Mr. René Schulte.
Magnus: Hello my friend, welcome to The Cloud Show!

René: Hey Magnus, thanks for having me.

Magnus: Absolutely, my pleasure. It's great to talk to you. We saw each other just a couple of weeks ago.

René: Yeah, it feels like yesterday, but it's been like four weeks. Three weeks.

Magnus: I know, I know, time flies.

René: It does.

Magnus: But it was good to see you in the same physical space. Right now, though, you're not in this physical space, my home office. You are in, what, your home office?

René: Yeah, it's my home office in Dresden, in Germany.
Magnus: In Germany! All right, so for the audience, tell us a little bit about yourself and what you do.

René: Sure, sure. My name is René, and I work for a company called Reply. We're a system integrator with 16,000 employees worldwide and quite a bit of revenue. Not a lot of folks know us; that's why I talk about it. We're mainly headquartered in Italy, but we also have a big presence in Germany, and actually also in the US, where we have a thousand people or so. What I do is work for our CTO and manage multiple groups, which we call communities of practice, and they work on innovative topics. My teams cover digital humans, synthetic media, spatial computing, and quantum and the future of computing. All the fun, great topics. I enjoy it because I'm working with a lot of smart young folks, doing some fun stuff but also some good bleeding-edge stuff. I'm also a content creator, making a bunch of stuff like we all do on social media, but I also have my own podcast, if you allow the shameless plug: Digital Dialogues. Right now we're talking with lots of experts in humanoid robotics, so if that is your thing, you might want to check it out. And what else? I'm an MVP and Microsoft Regional Director, just like you, and all these other things we do, right?

Magnus: It's like all these things we do. We keep hanging out together, that's for sure, and I like that a lot.
Magnus: All right, cool, so now we know a little bit about you. So you just picked the easiest teams to lead; I mean, just simple topics, nothing complicated at all in the stuff that you do in business, right? Like quantum. You know, easy stuff.

René: Easy, easy stuff. I keep on saying it's the bleeding edge, and sometimes we really bleed on the edge. I guess that's where the term comes from.

Magnus: That's exactly where that comes from.

René: And you know what's also a challenge these days, when we talk about AI and vibe coding and all the good stuff? There's not a lot of data available for these topics that the models could have been trained on. It's still an untouched field, so sometimes we're early in some areas.

Magnus: Yeah, for sure: the bleeding edge, the front line where things happen. And one of those things that has been happening for quite a while now is AI, and I'm sure a lot of people are confused about what's going on. Where is this going? What is that front line, that edge, and where is the trend in this space, the future of AI? You gave me this wonderful topic to talk about today. You said, "Hey, let's talk about the three key trends shaping the future of AI," and I was like, "That sounds really impressive, let's talk about that."
René: Right, great, let's do this. Of course, everyone might have a different view on this, but from the work I'm doing, these three trends clearly emerge. Let me give you an intro first. When we look at AI, and in particular the applied usage of AI, I think the two biggest transformative applied use cases will be these. One is computer use agents: AI agents that control your computer. Think about it: you can give it an old, let's say, WinForms application, or some complicated ERP system no one really understands how to use, and you can apply a computer use agent that will automatically learn how to use it and fulfill a task on your behalf. That will be a big transformative automation in office work, for sure. It will be massive. If you think about business process outsourcing and similar kinds of things, that might actually change from nearshoring or offshoring into "silicon shoring", where you leverage agents and control these agentic systems. So that's one part: computer use agents.
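To make the idea concrete, here is a minimal sketch of the perceive-decide-act loop a computer use agent runs. It is an illustration only: `capture_screen`, `send_to_model`, and `perform` are hypothetical placeholders, not any specific vendor's API.

```python
import time

def capture_screen() -> bytes:
    return b"...screenshot bytes..."   # placeholder for a real screen grab

def send_to_model(screenshot: bytes, goal: str, history: list[str]) -> dict:
    # Placeholder for a multimodal model call that looks at the screen
    # and proposes the next UI action (click, type, scroll) toward the goal.
    return {"type": "done", "reason": "demo stub"}

def perform(action: dict) -> None:
    print("executing:", action)        # real agents click/type via OS APIs

def run_agent(goal: str, max_steps: int = 20) -> None:
    history: list[str] = []
    for _ in range(max_steps):
        action = send_to_model(capture_screen(), goal, history)
        if action["type"] == "done":   # the model judges the task complete
            break
        perform(action)
        history.append(str(action))
        time.sleep(0.5)                # let the UI settle before looking again

run_agent("Create a new customer record in the legacy ERP system")
```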
René: And the second is likely humanoid robotics. We might see humanoid robots first in factories, but over the next decade I'm pretty optimistic that we will see them around us pretty much everywhere. I keep on joking that in Germany we might still be using fax machines by 2030, when the rest of the world employs humanoid robots. But maybe the final grand Turing test is when we get a humanoid robot to actually operate a fax machine. I don't know, maybe.

Magnus: We'll see, we'll see. So when we see robots doing things in our environment, is that going to be a really bad day for everyone who drives deliveries, or what's going to happen, do you think?
René: That's a common theme, and I think there's a misconception there, coming especially from science fiction movies. They ruined the perception of a lot of people: when you hear about robots, you immediately see the Terminator intro scene in your head.

Magnus: Right, and you think of I, Robot or something like that.

René: Exactly, I, Robot and all these crazy movies where the robot is always the bad guy. That is the perception a lot of folks have, and that is not going to happen. They will not replace humans either. But we are facing a challenge if you look at the market. I was just talking with a professor from Italy who is very deep into humanoid robotics, and he was basically saying it's not just the Western world: we now see birth rates declining in a lot of places around the world, and that actually means we don't have enough workforce. So who's going to do the delivery driving? I don't know how it is in Sweden or the rest of the world, but for sure here in Germany they're looking for truck drivers all over the place and cannot even find enough of them. So it's just going to fill the gap, basically, and it's going to allow us to scale.
Magnus: So it might mean that some things will change, but the opportunities for doing something are likely not going to go away. You're not going to be replaced by a silicon chip; that's not how it works. It might take over some work tasks that you used to do, maybe, but you will do something else. It's not a threat.

René: That's exactly the right way to look at it. In the end, these are waves of automation, which we also had in the past; there have been multiple waves of automation. The difference now is that it goes really fast, and that's the challenge we're facing: people need to adapt to this new modern world of work. So I think the biggest challenge is actually the cultural shift you mentioned. But we have to support everyone and take everyone with us. For a couple of years now, whenever I do talks, my recommendation has been: try out these services, try out these tools, experiment with them. Because one thing is for sure: if you have no clue about AI tools, if you're simply not interested and say, "Well, this is just going to go away," then you're in the same position as the folks in the 90s who said, "Oh, the internet is just a thing that's going to go away." That's not the case. This is here to stay, and if you want to be relevant in the market, you have to upskill yourself in that field, that's for sure.
Magnus: Yeah, for sure. I totally agree, and that's why, you know, change is scary for some people, and we already know that. So let's try to look forward on these three things you were mentioning, right?

René: Yes, yes. So let's dive into each of those. The first big trend we're seeing right now is multimodal models. Everyone knows generative AI; it's a subcategory of neural networks that generate content. Language models like the ChatGPTs of the world generate text, and of course there are other models that generate images and so on.
9:50
and now we also have models that can not
9:52
only generate images but they can also
9:55
have vision they have vision
9:56
capabilities so they can also kind of
9:58
understand these um images that you're
10:01
passing in and send them an image and
10:04
say create create use this image and
10:06
create you know me flying in an airplane
10:09
or something exactly and that's what we
10:12
have seen with the latest GPT4 image
10:14
generation which is pretty interesting
10:16
in itself from a technical standpoint
10:18
because they're using a different
10:20
approach than all these other image
10:21
generation models like you know stable
10:24
diffusion midjourney deli and so on they
10:26
use what is called diffusion method and
10:30
LLMs they are auto reggressive models
10:33
transformer architectures be behind this
10:36
and if you use the GPT4 image generation
10:38
it actually does not use a diffusion
10:40
model to generate the image But it also
10:41
does it auto reggressively so instead of
10:44
text token by token it's pixels by
10:46
pixels and that's why you get much
10:48
better consistency in the GPD4 image
10:50
generation also when you say hey take
10:52
this image and to put me into a plane
10:55
you really get reassembled right try to
10:57
do this with diffusion model much much
10:59
harder and that's the beauty of this
11:01
auto reggression but it also takes
11:02
longer then right so it's always kind of
11:04
trade-off there um but what I wanted to
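To see that difference in miniature: an autoregressive generator emits one token at a time, each conditioned on everything before it, whereas a diffusion model refines the whole canvas in parallel over many denoising steps. The toy sampler below illustrates only the autoregressive loop; the "model" is a random stand-in, not anything like GPT-4o.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 16     # toy vocabulary of "pixel values"
SEQ_LEN = 8    # toy image flattened to 8 tokens

def toy_model(prefix: list[int]) -> np.ndarray:
    """Stand-in for a trained transformer: a probability distribution
    over the next token given the tokens generated so far."""
    logits = rng.normal(size=VOCAB) + 0.1 * len(prefix)
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Autoregressive sampling: one token ("pixel") at a time, each step
# conditioned on the full prefix. Consistent, but inherently sequential.
tokens: list[int] = []
for _ in range(SEQ_LEN):
    tokens.append(int(rng.choice(VOCAB, p=toy_model(tokens))))

print("sampled token sequence:", tokens)
# A diffusion model would instead start from noise over the whole image
# and denoise all positions in parallel across many steps.
```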
René: But what I wanted to say is that these are multimodal models. True multimodal models can interpret and generate content across multiple modalities: image, text, audio, video, and so on. The really good ones actually have so-called joint embeddings. What that means is that when you train the model on text tokens and you also want, for example, a real-time audio model, you give it the matching audio snippet along with the text, so internally the model learns the text snippet together with the audio snippet. That's how these real-time audio models work. If you're using Gemini from Google, or GPT from OpenAI in the mobile app, you have this real-time audio conversation, and that's only possible with these true multimodal models. So: different data types in, different data types out. That's multimodality, and we're really just at the beginning. There's research from a university in Switzerland with 24 different modalities. I could not even come up with 24 different modalities, but apparently there are that many: depth data, spectral image data, and what have you.

Magnus: More than I know.

René: Yeah, I was like, what could the 24 be? I could probably come up with 10 or so, but hey, what do I know. And that is multimodality: different input types in, and you can also generate different output types. That's very important for these models to then also understand the world.
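One common way to build the joint embeddings René describes is contrastive training in the style of CLIP: paired snippets from two modalities are pulled to the same point in a shared vector space. The sketch below is a toy illustration of that idea with random features and tiny stand-in encoders; production multimodal models differ in the details, so treat every dimension and constant here as an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny stand-in encoders for the text and audio "towers".
text_encoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))
audio_encoder = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 32))

def clip_style_loss(text_batch: torch.Tensor, audio_batch: torch.Tensor) -> torch.Tensor:
    """Pull matching (text, audio) pairs together in one shared space and
    push mismatched pairs apart (symmetric InfoNCE, as in CLIP)."""
    t = F.normalize(text_encoder(text_batch), dim=-1)    # (B, 32)
    a = F.normalize(audio_encoder(audio_batch), dim=-1)  # (B, 32)
    logits = t @ a.T / 0.07                  # pairwise similarities
    targets = torch.arange(t.shape[0])       # i-th text matches i-th audio
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.T, targets)) / 2

# One toy training step: a batch of 8 paired snippets, random features
# standing in for real tokenized text and audio spectrograms.
loss = clip_style_loss(torch.randn(8, 128), torch.randn(8, 256))
loss.backward()
print(f"contrastive loss: {loss.item():.3f}")
```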
René: What we're also seeing is that, of course, you don't just use a single model; you actually use multiple instances. It could be the same model, or it could be different models, because some of these language models are trained for certain specifics: some are very good at math or coding, others are very good at text copywriting, that kind of stuff. When you combine those, you end up with a multi-agent system. And what is an agent? There are multiple levels there. The first experience is basically using a chat interface, where you have an assistant kind of thing. The next level is a single agent, where you have one model but give it a specific role, if you will, and you also give it contextual data.
For example, we created an agent for RFPs, request for proposals, if you're not familiar with the term. Who likes to answer RFPs? Of course we have to do it, but no one likes it, so we created an agent that is very good at it and has contextual data for RFPs. That helps a lot, and it's specifically configured for that task. So that's a single agent. If we then take a single agent and make multiple versions of it, or add other agents, we have a multi-agent system, and that's where it really becomes interesting. One example: take a software development project. You have all these different roles: software developer, QA analyst, project manager, business analyst, whatever; it always depends on the project and the client. Now imagine each of these roles becomes an agent, and you as a developer are orchestrating a team of agents. That's the mental shift a lot of folks need to make. Developers will still be needed, that's for sure, but instead of writing lots of lines of code, you will be an orchestrator; you will rather be a technical program manager of an agentic team. And the crazy part is that you can apply this to whole organizations, your whole company organizational structure: you can think about how to implement agentic teams in different roles.

Magnus: Wow.

René: And that's going to be a massive shift, actually.
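A minimal sketch of what such a role-based agent team can look like in code, assuming a generic chat-completion function: `call_llm` is a stub, the role prompts are invented for illustration, and no specific agent framework is implied.

```python
def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Stub standing in for any chat-completion API call."""
    return f"(draft response to: {user_prompt[:48]})"

class Agent:
    def __init__(self, role: str, instructions: str):
        self.role = role
        self.instructions = instructions

    def run(self, task: str) -> str:
        system = f"You are the {self.role} on a software project. {self.instructions}"
        return call_llm(system, task)

# Roles mirror the software-team example above.
team = [
    Agent("software developer", "Propose an implementation."),
    Agent("QA analyst", "List test cases and risks."),
    Agent("project manager", "Break the work into milestones."),
]

def orchestrate(task: str) -> str:
    """The human (or an orchestrator agent) fans the task out to every
    role and assembles the contributions into one combined answer."""
    return "\n".join(f"{agent.role}: {agent.run(task)}" for agent in team)

print(orchestrate("Add CSV export to the reporting module"))
```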
Magnus: Oh yeah. All right, so we're going for that. And there was another trend, about synthetic data?

René: Yes, yes, synthetic data, and that is becoming interesting for a couple of reasons.
First of all, if you listen to one of the latest OpenAI podcasts, with Sam Altman and some of his colleagues, they actually mentioned quite an interesting thing: the development of novel language models is not so much limited by compute power anymore. They say there's enough compute power available for them. What they're rather limited on, which is surprising, is data. Most of the models are data-bound: they use so much data to train these models that they're kind of reaching a peak with the available data they can use. So what some companies are doing, and Microsoft is very good there with the Phi-4 model ("fee", or as our US colleagues would say, "fye", the Greek letter phi), but also Google with Gemma: they are leveraging a lot of synthetic data to train the models. Instead of using only data from the real world, they take small data samples from the real world and then augment them with synthetic, artificially generated data. That could be textual data, but it could also be visual data, and that's where it also becomes interesting when we talk about the other trend, embodied intelligence or physical AI, thinking about robotics.
Let me give you a bunch of examples before we dive into the robotics aspect. One case where we leveraged synthetic data was with healthcare data, which, as you know, is of course very sensitive data, patient data; you don't want to use the real data too much. And of course, if you then train a model on that small amount of data, you lose a lot of generalization: if you don't have enough data, the model will not be good. So what you leverage are certain algorithms that take a very small data set and augment and expand it with higher diversity, so you get a higher variance in the end from a statistical standpoint, which means your model will generalize much better. That's what we actually did with synthetic medical document generation, and we got, I have the slide here, about three times higher variance, so three times better, just by taking existing data and augmenting it, adding more stuff. That's pretty interesting, I think, for classic tabular and textual machine learning data.
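The conversation doesn't detail the team's actual generation pipeline, so here is only a minimal illustration of the statistical idea: fit a distribution to a small sample, then draw synthetic records with slightly widened covariance to raise diversity. Real systems use far richer generators (LLMs, GANs, copulas) plus privacy checks; the features and the 1.5 widening factor below are invented.

```python
import numpy as np

rng = np.random.default_rng(42)

# Tiny stand-in for a sensitive real dataset: 20 patient records with
# two numeric features (say, age and a lab value). Purely illustrative.
real = rng.normal(loc=[55.0, 1.2], scale=[8.0, 0.3], size=(20, 2))

mean, cov = real.mean(axis=0), np.cov(real, rowvar=False)

# Sample synthetic records from the fitted distribution, widening the
# covariance a bit to push diversity beyond the small real sample.
synthetic = rng.multivariate_normal(mean, cov * 1.5, size=200)
augmented = np.vstack([real, synthetic])

print("real variance:     ", real.var(axis=0).round(3))
print("augmented variance:", augmented.var(axis=0).round(3))
```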
René: But even more interesting, I think, is the application of synthetic data to computer vision and robotics scenarios. Nvidia, for example, is pretty prominent in that space with Nvidia Omniverse, where they provide a kind of simulation environment. We're also working with clients there: for example, we're replicating warehouses, or simulating certain packaging stations in the warehouse, and we do this in order to identify issues.
Let me make it more practical. There's a case with a client: they produce goods, and they need to put boxes on a pallet and ship it out. A very usual use case, and of course the pallet loading is done by a robotic arm. There are still a lot of humans on the factory floor, for sure, but who wants to lift these heavy goods? That's the good part: people don't need to put the heavy boxes on the pallet. The issue is that sometimes the robot makes mistakes and puts the boxes in the wrong order or something, which means that when someone comes in with a forklift and picks the pallet up, it might fall over, and that is expensive: easily 100k of broken goods or so. And it only happens every few weeks or months, and that's the challenge: how would you detect this? You don't have the real-world data available to train your model. So what we're doing is simulating the whole thing in a 3D environment, where we can simulate all the different variations you can imagine. We can render a million different permutations out of this, with different light, different shading, different backgrounds, and all of that, and this is the data we then use to train a computer vision model, which will detect the issue before it's too late. That is one case.
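What "a million permutations" usually means in practice is domain randomization: every rendered training image gets freshly sampled lighting, background, and camera pose. A minimal sketch of such a sweep follows; the parameter names and ranges are invented for illustration and are not an Omniverse or Unreal API.

```python
import random

random.seed(7)

# Hypothetical domain-randomization sweep for the pallet scene: each
# sample is one render configuration for the simulator.
BACKGROUNDS = ["warehouse_a", "warehouse_b", "loading_dock"]

def sample_render_config() -> dict:
    return {
        "light_intensity": random.uniform(200.0, 1500.0),  # lux
        "light_angle_deg": random.uniform(0.0, 360.0),
        "background": random.choice(BACKGROUNDS),
        "camera_jitter_m": random.uniform(0.0, 0.05),
        # The label we ultimately train on: is the box placement faulty?
        "misplaced_box": random.random() < 0.5,
    }

# A real pipeline would feed each config to the renderer and save the
# image plus its label; here we just print the first few permutations.
for _ in range(3):
    print(sample_render_config())
```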
René: Another one: think about dangerous locations, say a mine, a coal mine or whatever, where miners are inside. Imagine there's a fire and you want to send in a robot, and the robot needs to navigate and know the way. It's very hard to get that training data from the real world: when there's a fire in the mine, when would you collect training data? You don't have a million fires to collect real data from.

Magnus: Right, that would be bad, if you had a million fires.

René: It would be really bad.

Magnus: But you would have enough data to train your AI on, which would be good. But it would be terrible to have a million fires.

René: Yeah, exactly, so don't do this.
What we rather do is leverage Unreal Engine, or other 3D rendering methodologies we have these days. If you look at some of the games out there, the quality of 3D rendering is incredible. So what we can do is send a robot into the mine once with a laser scanner, and it does a 3D scan of the environment. Then we take this 3D scan, put it into Unreal or Omniverse, it doesn't matter which, and add artificial smoke or artificial fire, in all the different variations you can imagine.

Magnus: So it's all fake, but the AI is trained on this generated synthetic data.

René: Exactly. Then we have the trained model, we take it, and this is called sim-to-real: we do a simulation first and then deploy it on the real robotic system, and then of course comes the final real test of whether it actually works and was trained well. That's a lot of the methodology we see these days. These are also called robotics gyms: just as you go to the gym to train, you can have these robotic systems train in these Omniverse or other environments.
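The "gym" metaphor is literal in the tooling: reinforcement-learning environments expose a standard reset/step loop. Below is that loop with the open-source gymnasium package and a toy environment, with a random policy standing in for a real learning algorithm; a robotics gym swaps in a physics simulation of the robot and its scene.

```python
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)

total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()  # stand-in for a trained policy
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += float(reward)
    if terminated or truncated:         # episode over: reset the sim
        obs, info = env.reset()

env.close()
print(f"reward collected by the random policy: {total_reward:.0f}")
# Sim-to-real: once a policy trained in simulation is good enough,
# it is deployed and validated on the physical robot.
```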
Magnus: Cool, and that's when we come to embodied intelligence, right? We take...

René: That's right, that's the next thing.

Magnus: The third out of the three we're talking about today, so let's finish off strong on embodied intelligence then. What are we talking about now?

René: Yeah, exactly. That's the point where we take this synthetic data, for example, and train a robotic system, and that is where we have embodied intelligence, or physical AI. You have these novel AI algorithms, these novel AI services, now applied in an embodiment. That could be a quadruped, a four-legged robot like the Boston Dynamics Spot, or a humanoid robot; we also have one of those, a bipedal robot that walks like a human. We leverage synthetic data to train these too, and then apply it. So that's one part of embodied intelligence: the physical embodiment. The other one is digital embodiment, where I have an avatar, a digital human; I give a face to a chatbot, if you will.
But I can also attach affective computing to it, which we're also working on: think about giving it a personality. There's actually a fun anecdote here. You probably know the Inside Out movie, the famous animated movie. The girl there has eight emotions in her head, and they did not just make those up; they're based on a psychological model from an expert called Paul Ekman. We're leveraging the same Paul Ekman model for our digital humans, so eight different emotions are firing, and each of them is an agent in a multi-agent system. Let's say the fear agent or the surprise agent: they get the input from the user and they fire, and then we have an orchestration agent that assembles the final answer. The result is a system that reacts more empathetically, with more sentiment and emotion, one that you as a human can talk to and feel more connected to, I would say.
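As a rough sketch of that architecture, each emotion runs as its own scoring agent and an orchestrator blends the strongest activations into the reply. Everything below is hypothetical: the keyword triggers stand in for real sentiment models or LLM calls, and the emotion list is illustrative rather than the exact eight-emotion Ekman set used in production.

```python
EMOTIONS = ["joy", "sadness", "fear", "anger",
            "disgust", "surprise", "contempt", "anticipation"]

# Toy trigger words; a real emotion agent would call a model instead.
TRIGGERS = {
    "fear": {"danger", "fire", "alarm"},
    "joy": {"great", "love", "win"},
    "surprise": {"sudden", "unexpected", "wow"},
}

def emotion_agent(emotion: str, user_input: str) -> float:
    """Each emotion agent returns its activation (0..1) for the input."""
    words = set(user_input.lower().replace("!", "").split())
    hits = len(words & TRIGGERS.get(emotion, set()))
    return min(1.0, hits / 2)

def orchestrator(user_input: str) -> str:
    """Fire all emotion agents, then assemble the final answer weighted
    toward the most activated emotion."""
    scores = {e: emotion_agent(e, user_input) for e in EMOTIONS}
    dominant = max(scores, key=scores.get)
    return f"[responding in a '{dominant}' tone] activations: {scores}"

print(orchestrator("There is a sudden fire alarm in the building!"))
```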
Magnus: Wow. That is incredibly cool. You know what, unfortunately we are running out of time for this show. We could have talked about this forever, but we'll stay true to form; it's a 20-minute show. So for the audience: this is where we cut René off this time. Thank you so much for being on The Cloud Show today, I appreciate it.

René: Thanks for having me, and take care, my friends. Don't worry, just make sure you look into AI services and stay up to date.

Magnus: That's right! And I'll see you guys next time on The Cloud Show.