Best practices in building Cloud distributed systems | Software Architecture Conference

Name: Best practices in building Cloud distributed systems | Software Architecture Conference | Open Video
Uploaded: 2025-08-06T12:06:12+00:00
Duration: 29 min 16 s

0:03
Okay, so the main agenda for today: we'll go through an introduction to distributed systems, their characteristics and types, and the best practices. We'll also look into some of the design patterns for cloud distributed systems.
0:17
The general way this works is that you have multiple nodes placed in different locations, and those nodes communicate with each other over the network. That's the difference between a centralized system and a distributed system.
0:36
So now we're going to talk more about the characteristics of distributed systems. There are many, but generally four of them come to my mind when I talk about the characteristics, so let's look at each one of them.
0:52
The first one is service discovery. The nature of distributed systems is that you have multiple services interacting with one another, and they are completely decoupled; they're not in one location. So you need to know how a service is going to communicate with the other ones. Somehow they have to discover each other. Let's say you are scaling one of the services up or down dynamically: how will you be able to communicate with its instances? That's what service discovery is for.
1:25
There are multiple ways to do service discovery: one is DNS-based discovery, and the others are client-side and server-side discovery. In DNS-based discovery, you use the Domain Name System to map service names to IP addresses and ports, which allows services to discover each other's locations.
1:48
For client-side discovery, you involve the client: the client has access to some sort of registry with the list of all service locations, and based on that it identifies which service to invoke.
2:02
Similarly, on the server side, you maintain a centralized service registry where all the services can come and register themselves, and then you can query that registry to discover other services.
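The registry idea described above can be sketched in a few lines. This is a minimal in-memory illustration, not a real discovery product: `ServiceRegistry`, `discover`, the service name `orders`, and the addresses are all hypothetical names chosen for the example.

```python
import random


class ServiceRegistry:
    """In-memory stand-in for a centralized service registry (hypothetical)."""

    def __init__(self):
        self._instances = {}  # service name -> set of (host, port)

    def register(self, name, host, port):
        # A service instance announces itself when it starts.
        self._instances.setdefault(name, set()).add((host, port))

    def deregister(self, name, host, port):
        # An instance removes itself when it is scaled down.
        self._instances.get(name, set()).discard((host, port))

    def lookup(self, name):
        return sorted(self._instances.get(name, ()))


def discover(registry, name, rng=random):
    """Client-side discovery: the caller queries the registry and picks an instance."""
    instances = registry.lookup(name)
    if not instances:
        raise LookupError(f"no registered instances for {name!r}")
    return rng.choice(instances)


registry = ServiceRegistry()
registry.register("orders", "10.0.0.5", 8080)
registry.register("orders", "10.0.0.6", 8080)    # scale up
registry.deregister("orders", "10.0.0.5", 8080)  # scale down
```

Because callers always go through `lookup`, they keep working as instances come and go, which is the scaling scenario mentioned in the talk.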
2:14
The next slide is on load balancing. This is another key aspect of distributed systems: how you plan to distribute your load across multiple systems. If you don't distribute the workloads properly, you end up using only one server, you are prone to server failures, and there is a performance impact as well.
2:45
There are basically two types of load balancing: static and dynamic. Static load balancing uses a predefined set of rules to distribute the traffic, while dynamic load balancing quickly adapts to changes in traffic volume, load, and server availability, and based on that it picks whichever server is available and does the load balancing automatically.
3:07
There are a couple of commonly used algorithms. Round robin distributes the traffic evenly among a group of servers in a circular manner. Then there's the least-connections algorithm, which distributes incoming traffic to the servers with the fewest active connections. Let's say only one server is currently taking care of the load and there are other servers with fewer active connections: it will direct the traffic to those servers. That way you prevent any one server from being overwhelmed and reduce downtime.
3:47
There's also DNS-based load balancing, which again uses IPs and domain names to distribute the traffic: it resolves a name to different IP addresses or servers and, based on that, redirects the traffic.
4:02
Industry-standard options are already available, and I would always recommend not reinventing the wheel. Use whatever is available out of the box: you can use Azure Load Balancer or a service mesh, and if your application is containerized, use Kubernetes, which provides built-in load balancing for distributing load across multiple pods or services. So rather than reinventing the wheel, make sure you use the industry-provided products.
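To make the two algorithms named above concrete, here is a small sketch. It only illustrates the selection logic; as the talk advises, production systems should rely on an existing load balancer. The class names and the server labels `a`, `b`, `c` are assumptions made for the example.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed circular order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Sends each new request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties are broken by insertion order of the servers.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes so the count stays accurate.
        self.active[server] -= 1


rr = RoundRobinBalancer(["a", "b", "c"])
rr_order = [rr.pick() for _ in range(4)]

lc = LeastConnectionsBalancer(["a", "b"])
first = lc.acquire()
second = lc.acquire()
lc.release(first)
third = lc.acquire()
```

Round robin here is a static rule, while least connections reacts to current load, which matches the static versus dynamic distinction made in the talk.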
4:32
The next one is distributed tracing and logging. This is very important because you now have multiple services intertwined in a distributed systems architecture. You want to see how requests are coming into your system, and if there is any issue or any service is down, you need the ability to look at the logs and the requests to identify bottlenecks and debug the issue quickly. That's where the importance of distributed tracing and logging comes in.
5:12
There are a lot of open-source distributed tracing systems available, such as Zipkin and Jaeger, and for distributed logging you can use the ELK stack. This really helps you process each and every event occurring in the system, and when you combine distributed tracing and logs, you can easily identify bottlenecks, diagnose errors, and improve overall system performance.
5:42
I really appreciate the way the .NET Aspire team has built its dashboard. It comes by default: today, if you create any .NET Core web application and onboard it to Aspire, it gives you a dashboard where you can look at each and every trace and request. You can identify how your services are getting called and from which IP address. All these details are available right out of the box; you don't have to do anything. It's an easy way to identify whether any service is having trouble: you can look at the logs and see if there are any latency issues.
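The core idea behind distributed tracing is that every hop of one request carries the same trace ID, so the hops can be stitched back together and timed. The sketch below illustrates only that idea; it is not the API of Zipkin, Jaeger, or .NET Aspire, and `Tracer`, `Span`, and the two service functions are hypothetical.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str              # shared by every span of one request
    span_id: str               # unique per hop
    parent_id: Optional[str]   # span that called this one, None for the root
    service: str
    duration_ms: float = 0.0


class Tracer:
    def __init__(self):
        self.spans = []

    def run(self, service, work, parent=None):
        """Run work(span) as one span; downstream calls pass span as parent."""
        span = Span(
            trace_id=parent.trace_id if parent else uuid.uuid4().hex,
            span_id=uuid.uuid4().hex[:16],
            parent_id=parent.span_id if parent else None,
            service=service,
        )
        start = time.perf_counter()
        try:
            return work(span)
        finally:
            span.duration_ms = (time.perf_counter() - start) * 1000
            self.spans.append(span)


tracer = Tracer()


def inventory(span):
    return "in stock"


def orders(span):
    # The orders service calls inventory and propagates its own span.
    return tracer.run("inventory", inventory, parent=span)


result = tracer.run("orders", orders)
```

Sorting the recorded spans by `duration_ms` within one `trace_id` is a simple way to see which service in the call chain is the bottleneck.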
6:27
The fourth one is service monitoring. Service monitoring is very important because you now have all these distributed systems connected over the network. What happens if one of the servers goes down? You need some way to check the health of the systems. For that we have a lot of tools available, such as Prometheus and Grafana dashboards. In our systems we also use Grafana dashboards to track node health: if nodes are down, we trigger notifications to our teams and create incidents, and based on that we identify the bottlenecks and fix them before a customer or somebody else complains.
7:13
So when building distributed systems, make sure you understand the importance of monitoring the health of each and every system. As long as you are getting these notifications and managing the health, I think you will have a reliable distributed system.
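A heartbeat check is one simple way to detect that a node has gone down and to trigger a notification. The sketch below assumes nodes report in periodically; `HeartbeatMonitor`, the 30-second timeout, and the node names are hypothetical, and timestamps are passed in explicitly so the example is deterministic (a real monitor would typically read a clock such as `time.monotonic()`).

```python
class HeartbeatMonitor:
    """Flags nodes that have not sent a heartbeat within timeout_s seconds."""

    def __init__(self, timeout_s, notify):
        self.timeout_s = timeout_s
        self.notify = notify   # callback, e.g. page the team or open an incident
        self.last_seen = {}    # node -> time of last heartbeat
        self.down = set()

    def heartbeat(self, node, now):
        self.last_seen[node] = now
        self.down.discard(node)  # a node that reports again is healthy

    def check(self, now):
        for node, seen in self.last_seen.items():
            if now - seen > self.timeout_s and node not in self.down:
                self.down.add(node)
                self.notify(node)  # notify once per outage, not on every check
        return sorted(self.down)


alerts = []
monitor = HeartbeatMonitor(timeout_s=30, notify=alerts.append)
monitor.heartbeat("node-1", now=0)
monitor.heartbeat("node-2", now=0)
monitor.heartbeat("node-1", now=25)
down = monitor.check(now=40)   # node-2 has been silent for 40 seconds
monitor.check(now=45)          # still down, but no second alert
```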
7:32
So let's look at some of the key benefits of using distributed systems. The first one that comes to mind is scalability. There are different scalability options available: vertical scaling and horizontal scaling. Vertical scaling involves increasing the hardware resources of a single machine, which is often limited by physical constraints: you cannot go beyond the maximum number of cores or the memory capacity that the machine supports.
8:09
With horizontal scaling, instead of growing one machine, you add more machines to your service and scale out. That way, if there is new load, or if one machine fails, you can automatically make use of the other machines in the service. That's the advantage you get with distributed systems.
8:41
Then there is reliability. Distributed systems are more resilient to failures than centralized systems. Why? Because the data is replicated across multiple nodes, so the failure of a single node or a set of nodes does not necessarily bring down the entire system: failover happens automatically. Let's say you are using some sort of master-slave architecture with one primary node, and all the traffic is handled by that primary. If there is suddenly a non-transient failure, you automatically fail over to the secondary, and the system keeps serving user requests. So you get overall reliability when you choose distributed systems.
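The primary-to-secondary failover described above can be sketched as follows. This is a simplified illustration under stated assumptions: `FailoverClient` and the two node functions are hypothetical, and every `ConnectionError` is treated as non-transient, whereas a real system would usually retry transient errors before switching.

```python
class FailoverClient:
    """Routes requests to the primary and switches to the secondary on failure."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.active = primary

    def request(self, payload):
        try:
            return self.active(payload)
        except ConnectionError:
            if self.active is self.secondary:
                raise  # both nodes have failed; nothing left to fail over to
            self.active = self.secondary  # later requests go straight here
            return self.active(payload)


def primary_node(payload):
    # Simulates a primary that is down.
    raise ConnectionError("primary node is down")


def secondary_node(payload):
    return f"secondary handled {payload}"


client = FailoverClient(primary_node, secondary_node)
reply = client.request("order-42")
```

From the caller's point of view the request still succeeds, which is the reliability benefit the talk attributes to replication.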
9:39
systems and uh the next one is basically
9:39
systems and uh the next one is basically on the performance so as long as you uh
9:43
on the performance so as long as you uh
9:43
on the performance so as long as you uh involve breaking down your complex
9:45
involve breaking down your complex
9:45
involve breaking down your complex workloads into the smaller manageable
9:47
workloads into the smaller manageable
9:47
workloads into the smaller manageable parts right now you take advantage of
9:50
parts right now you take advantage of
9:50
parts right now you take advantage of this multiple machines so sometimes you
9:53
this multiple machines so sometimes you
9:53
this multiple machines so sometimes you can distribute your work into small
9:56
can distribute your work into small
9:56
can distribute your work into small chunks where the primary will be um you
9:59
chunks where the primary will be um you
9:59
chunks where the primary will be um you know handling some task and then there
10:01
know handling some task and then there
10:02
know handling some task and then there are active secondaries which are like uh
10:05
are active secondaries which are like uh
10:05
are active secondaries which are like uh initially it can be idle but then if you
10:07
initially it can be idle but then if you
10:07
initially it can be idle but then if you have a complex task like you know doing
10:09
have a complex task like you know doing
10:09
have a complex task like you know doing some sort of large scale data processing
10:12
some sort of large scale data processing
10:12
some sort of large scale data processing or you know going through uh millions of
10:14
or you know going through uh millions of
10:14
or you know going through uh millions of U logs on your uh local system or a
10:17
U logs on your uh local system or a
10:17
U logs on your uh local system or a centralized Network share now you can
10:20
centralized Network share now you can
10:20
centralized Network share now you can take advantage of the uh secondaries
10:22
take advantage of the uh secondaries
10:22
take advantage of the uh secondaries also so that's how you will get more uh
10:24
also so that's how you will get more uh
10:25
also so that's how you will get more uh improved
10:27
improved
10:27
improved performance and uh yeah so let's look at
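As a rough sketch of the chunking idea described here: the snippet below splits a large list of log lines into chunks and fans them out to a pool of workers. It is only an illustration under stated assumptions. A local thread pool stands in for the primary and secondary machines, and the function names and the sample log data are hypothetical, not something from the talk.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    """Break a large workload into smaller, independently processable parts."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def count_errors(log_lines):
    """The per-chunk task: count the log lines that contain the word ERROR."""
    return sum(1 for line in log_lines if "ERROR" in line)


def count_errors_distributed(log_lines, workers=4, chunk_size=1000):
    """Fan the chunks out to the workers and combine the partial results."""
    chunks = split_into_chunks(log_lines, chunk_size)
    # Assumption: threads stand in for secondary nodes in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_errors, chunks))


# Hypothetical log data: every tenth line is an error.
logs = [f"{'ERROR' if i % 10 == 0 else 'INFO'} request {i}" for i in range(10_000)]
total_errors = count_errors_distributed(logs)
```

In a real deployment the chunks would be sent to other machines, for example through a job queue, but the split, process, and combine steps stay the same.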
10:29
So let's look at some of the distributed system patterns. There are a lot of them, but I've captured seven or eight that I use regularly and that are the most common. I'm going to go through each design pattern, and we'll analyze them in more detail.
10:49
The first one is the Ambassador pattern. Let's say you're working as a director in a big company, and you have a personal assistant who takes care of all your appointments and communications. Everybody who wants to book time with you first goes through the personal assistant. That is nothing but your Ambassador pattern.
11:13
You have your application code, and then you develop another service that handles logging, monitoring, or retries on behalf of your application code. All the communication goes through this ambassador. It's just like a gateway, and it provides a simplified way to manage communication between the services.
11:39
So that's the Ambassador pattern. The benefits you get are reduced latency and enhanced security, and the overall architecture of your distributed system also improves. But it works best for smaller distributed services. As soon as your distributed service grows, the ambassador can become a bottleneck.
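To make the Ambassador idea more concrete, here is a minimal in-process sketch. It assumes the ambassador is a wrapper object, whereas in practice it is usually a separate process or sidecar next to the application. The class name `Ambassador`, the `flaky_service` function, and the retry limit are all hypothetical choices for illustration.

```python
import logging

logger = logging.getLogger("ambassador")


class Ambassador:
    """Sits between the application and a remote service, adding logging and retries."""

    def __init__(self, remote_call, max_attempts=3):
        self.remote_call = remote_call    # the actual call to the remote service
        self.max_attempts = max_attempts
        self.calls_made = 0               # simple monitoring counter

    def request(self, payload):
        last_error = None
        for attempt in range(1, self.max_attempts + 1):
            self.calls_made += 1
            try:
                response = self.remote_call(payload)
                logger.info("attempt %d succeeded", attempt)
                return response
            except ConnectionError as error:
                logger.warning("attempt %d failed: %s", attempt, error)
                last_error = error
        # Every attempt failed: surface the last error to the application.
        raise last_error


# Hypothetical remote service that fails twice before answering.
failures_left = [2]


def flaky_service(payload):
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise ConnectionError("service unavailable")
    return f"processed {payload}"


ambassador = Ambassador(flaky_service)
result = ambassador.request("order-42")
```

The application code only calls `ambassador.request(...)`; the logging, call counting, and retrying live in one place outside the business logic.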
12:04
Let's look at the second one. The second one is my favorite: the circuit breaker pattern. It is used in distributed systems to prevent cascading failures.
12:16
There are two types of failures: transient and non-transient. A transient failure recovers quickly, so you can just apply a retry policy, maybe linear or exponential, and at some point the service will come back up. But for the cascading failures, which are the non-transient ones, you need a way to recognize that the system is down. If you keep calling a system that is down, you're simply wasting your resources. That's where the circuit breaker pattern comes into the picture.
13:00
It is typically used in conjunction with the retry pattern and the timeout pattern. The beauty of it is that it has three states.
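The linear and exponential retry policies mentioned for transient failures can be sketched as follows. This is an illustrative sketch, not a reference implementation: the base delay, the cap, the attempt limit, and the function names are assumptions, and the `sleep` parameter is injectable so the example can record delays instead of actually waiting.

```python
import time


def linear_backoff(attempt, base_delay=1.0):
    """Delay grows by a fixed step: 1s, 2s, 3s, ..."""
    return base_delay * attempt


def exponential_backoff(attempt, base_delay=1.0, max_delay=30.0):
    """Delay doubles on each attempt (1s, 2s, 4s, ...) up to a cap."""
    return min(base_delay * 2 ** (attempt - 1), max_delay)


def call_with_retry(operation, max_attempts=4,
                    backoff=exponential_backoff, sleep=time.sleep):
    """Retry a transient failure, waiting backoff(attempt) seconds between tries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: treat the failure as non-transient
            sleep(backoff(attempt))


# Hypothetical operation that fails twice and then recovers.
remaining_failures = [2]


def transient_operation():
    if remaining_failures[0] > 0:
        remaining_failures[0] -= 1
        raise ConnectionError("temporary glitch")
    return "ok"


waits = []  # record the delays instead of sleeping
outcome = call_with_retry(transient_operation, sleep=waits.append)
```

When the retries run out, the failure is no longer treated as transient, which is the point where a circuit breaker becomes useful.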
13:12
It has three states: closed, half-open, and open. When the state is closed, the circuit is connected, so whatever calls you make from service A to service B go through, and you get the response back from service B.
13:30
But let's say that, because of a performance problem or some other issue, service B goes into a non-transient failure, so it is down. Even if you keep making calls from A, you're not going to get any response. That's where you plug in the circuit breaker. It automatically changes the state from closed to open because a certain failure threshold has been reached, so there's no point in invoking that service anymore.
14:05
The breaker stays in the open state for a timeout. If you expect the service to be back online in, say, half a day or one day, you can set the timeout accordingly. Once the timeout has expired, the breaker checks again: it changes its state to half-open and tries another call to service B. If service B has come back online, then based on the success counter the breaker goes back to the closed state and closes the complete path. If it doesn't reach the success threshold, it automatically goes back to the open state.
14:46
This way you're not wasting your resources. As long as you know the service is down and the failure is non-transient, the circuit breaker pattern will definitely help you avoid wasting resources.
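The three states and the transitions between them can be sketched as a small state machine. This is a minimal single-threaded sketch under assumptions: the thresholds, the timeout value, and the names `CircuitBreaker` and `CircuitOpenError` are hypothetical, and the clock is injectable so the example can simulate the timeout expiring.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, success_threshold=2,
                 open_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.open_timeout = open_timeout
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.successes = 0
        self.opened_at = None

    def _trip(self):
        """Move to the open state and start the timeout."""
        self.state = "open"
        self.opened_at = self.clock()
        self.failures = 0
        self.successes = 0

    def call(self, operation):
        if self.state == "open":
            if self.clock() - self.opened_at < self.open_timeout:
                raise CircuitOpenError("failing fast, service B is not called")
            self.state = "half_open"  # timeout expired: allow trial calls
        try:
            result = operation()
        except Exception:
            if self.state == "half_open":
                self._trip()  # a trial call failed: back to open
            else:
                self.failures += 1
                if self.failures >= self.failure_threshold:
                    self._trip()
            raise
        if self.state == "half_open":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state = "closed"
                self.successes = 0
        else:
            self.failures = 0  # a success while closed resets the failure count
        return result


def service_b_down():
    raise ConnectionError("service B is down")


now = [0.0]  # fake clock so the timeout can be simulated
breaker = CircuitBreaker(failure_threshold=2, success_threshold=1,
                         open_timeout=10.0, clock=lambda: now[0])
for _ in range(2):
    try:
        breaker.call(service_b_down)
    except ConnectionError:
        pass
state_after_failures = breaker.state

now[0] = 11.0  # the timeout has expired
recovered = breaker.call(lambda: "response from B")
state_after_recovery = breaker.state
```

While the breaker is open, calls fail immediately with `CircuitOpenError` instead of reaching service B, which is where the resource saving comes from.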
15:10
The third one is the CQRS (Command Query Responsibility Segregation) pattern. This design pattern separates the read and write operations of a system. It allows for better scalability and performance by optimizing the system for each type of operation.
15:23
Consider a big e-commerce application like Amazon. A lot of people browse through the products, and if they like a product they add it to the cart and then purchase it. But far more of the user traffic is reading and browsing products than making actual purchases. That's where you can segregate your workflow: you write separate handlers for command invocation and for query invocation. That's how you separate the reads from the writes.
16:11
On the database side, you also need to maintain eventual consistency. If there have been any operations on the write side, you need to make sure all of that data flows into the read database as well. That's how you maintain consistency.
16:24
And if you look at the architecture, your code looks really good. If you're jumping into a project with a lot of distributed systems, and the methods and calls are separated following the CQRS pattern, it's very easy to follow.
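A small sketch of the command and query split might look like this. It assumes in-memory dictionaries in place of the write and read databases and an explicit `sync_read_model` step in place of real replication. The class names and the shopping cart data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AddToCart:
    """A command: changes state and returns nothing."""
    user: str
    product: str


@dataclass
class GetCart:
    """A query: returns data and changes nothing."""
    user: str


class CommandHandler:
    def __init__(self, write_store, pending_events):
        self.write_store = write_store
        self.pending_events = pending_events

    def handle(self, command):
        self.write_store.setdefault(command.user, []).append(command.product)
        # Remember the change so it can flow to the read side later.
        self.pending_events.append((command.user, command.product))


class QueryHandler:
    def __init__(self, read_store):
        self.read_store = read_store

    def handle(self, query):
        return list(self.read_store.get(query.user, []))


def sync_read_model(pending_events, read_store):
    """Apply pending changes to the read side; until this runs, reads are stale."""
    while pending_events:
        user, product = pending_events.pop(0)
        read_store.setdefault(user, []).append(product)


write_store, read_store, pending = {}, {}, []
commands = CommandHandler(write_store, pending)
queries = QueryHandler(read_store)

commands.handle(AddToCart("alice", "laptop"))
stale = queries.handle(GetCart("alice"))   # read side has not caught up yet
sync_read_model(pending, read_store)
fresh = queries.handle(GetCart("alice"))   # read side is now consistent
```

The gap between `stale` and `fresh` is the eventual consistency window: the write has succeeded, but the read model only reflects it after the sync step runs.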
16:46
The next one is the leader election pattern. Here you have a bunch of nodes, and the nodes elect a leader. The leader node is responsible for coordinating and executing tasks. The pattern is used to build highly available systems that can tolerate failures.
17:08
If the leader node goes down, all the remaining nodes elect a new one. The main advantage here is that all the exchanges and all the decision making happen via the leader node. That way you avoid conflicts, and the decision making is completely owned by the leader node.
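The election and re-election steps can be illustrated with a deliberately simplified, single-process simulation. It assumes a bully-style rule where the live node with the highest ID wins. That rule, the `Cluster` class, and the node IDs are assumptions for illustration only; production systems typically rely on a consensus protocol such as Raft or on a coordination service such as ZooKeeper or etcd.

```python
class Cluster:
    """Tracks which nodes are alive and which one currently leads."""

    def __init__(self, node_ids):
        self.alive = set(node_ids)
        self.leader = None
        self.elect()

    def elect(self):
        """Simplified rule: the live node with the highest ID wins the election."""
        self.leader = max(self.alive) if self.alive else None
        return self.leader

    def fail(self, node_id):
        """Mark a node as down; if it was the leader, the rest elect a new one."""
        self.alive.discard(node_id)
        if node_id == self.leader:
            self.elect()


cluster = Cluster([1, 2, 3, 4])
first_leader = cluster.leader    # node 4 leads initially
cluster.fail(4)                  # the leader goes down
second_leader = cluster.leader   # the remaining nodes elect node 3
```

A follower failing does not trigger an election; only the loss of the leader does.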
17:36
The next pattern is event sourcing. This is also a very important one. Whenever something changes in your system, you handle that change as an event. The system keeps a journal of all these events, recording whatever happens in your system.
17:58
That way, when you want to go back and do some auditing or time-travel debugging, say some issue happened in your system and you want to identify it, you have the actual events stored rather than just an updated database record. That makes it easy to debug and understand exactly what happened at that point.
18:22
One main example is Git version control. Every commit you make represents a change, so it's easy to look at each and everyone's change, and it also gives you the ability to revert without losing your changes or running into conflicts. Similarly, in the distributed systems world, you can use the event sourcing pattern for that.
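Here is a minimal sketch of the journal idea, assuming a hypothetical account balance as the state being tracked. The `Event` and `Journal` names and the event kinds are illustrative. The key point is that the current state is never stored directly; it is rebuilt by replaying the events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str      # "deposited" or "withdrew"
    amount: int


class Journal:
    """Append-only log: events are added, never updated or deleted."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def replay(self, upto=None):
        """Rebuild the balance from the first `upto` events (all of them if None)."""
        balance = 0
        for event in self._events[:upto]:
            if event.kind == "deposited":
                balance += event.amount
            elif event.kind == "withdrew":
                balance -= event.amount
        return balance


journal = Journal()
journal.append(Event("deposited", 100))
journal.append(Event("withdrew", 30))
journal.append(Event("deposited", 5))

current_balance = journal.replay()            # state after all events
balance_after_two = journal.replay(upto=2)    # "time travel" to an earlier point
```

Replaying only a prefix of the journal is what makes auditing and time-travel debugging possible: any past state can be reconstructed on demand.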
18:55
The next one is the publisher and subscriber pattern. Most distributed systems use the pub/sub model in one way or another, because it decouples your systems: you don't have to worry about who your publisher is or who your subscriber is. Information is published to an event center, or event broker, which handles all the incoming messages, and whoever wants to listen subscribes to it.
19:28
One beautiful advantage you get is this. Let's say you have multiple distributed systems, and you change your profile name and picture in the main portal, on your profile page. This should automatically be reflected in the different services of the system; even if you go to any other page, the change should show up. Why does that work? The profile service makes the change and submits this information to the event broker, and whoever has subscribed automatically gets a notification. The subscribers look at the message and update themselves accordingly. That's the beauty of the publisher and subscriber model.
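The profile update scenario can be sketched with a tiny in-memory broker. This is only an illustration: the `EventBroker` class, the topic name `profile.updated`, and the two subscriber caches are hypothetical, and delivery here is synchronous, whereas a real message broker would deliver asynchronously over the network.

```python
from collections import defaultdict


class EventBroker:
    """Routes published messages to whoever subscribed to the topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # The publisher does not know who, if anyone, is listening.
        for callback in self._subscribers[topic]:
            callback(message)


broker = EventBroker()

# Two hypothetical services that each keep their own copy of the display name.
comments_service_names = {}
orders_service_names = {}


def update_comments_service(message):
    comments_service_names[message["user_id"]] = message["name"]


def update_orders_service(message):
    orders_service_names[message["user_id"]] = message["name"]


broker.subscribe("profile.updated", update_comments_service)
broker.subscribe("profile.updated", update_orders_service)

# The profile page publishes one event; every subscriber picks it up.
broker.publish("profile.updated", {"user_id": 7, "name": "New Name"})
```

Adding a third service that needs the profile name only requires one more `subscribe` call; the publisher code does not change.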
20:18
So let's look at some of the types of distributed systems. There are multiple, but I've captured the most used ones here.
20:27
There's something called the hub-and-spoke architecture. It's a very simple and efficient way to manage communication in a network because there is a central hub, and all the communication happens via that central hub. That makes it easy to manage, secure, and troubleshoot. But the hub is also a single point of failure, and as the system grows it becomes a bottleneck. If the system is small enough, the hub-and-spoke model works for your needs, but as soon as the system grows, it becomes a bottleneck.
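The trade-off in the hub-and-spoke model can be shown with a short sketch, assuming an in-memory `Hub` object that routes messages between named spokes. The class, the spoke names, and the `online` flag are hypothetical and exist only to illustrate the single point of failure.

```python
class Hub:
    """Central node: every message between spokes passes through it."""

    def __init__(self):
        self.spokes = {}          # spoke name -> inbox (a list of messages)
        self.online = True
        self.messages_routed = 0  # one place to monitor all traffic

    def register(self, name, inbox):
        self.spokes[name] = inbox

    def send(self, sender, recipient, message):
        if not self.online:
            raise ConnectionError("hub is down: no spoke can reach any other")
        self.spokes[recipient].append((sender, message))
        self.messages_routed += 1


hub = Hub()
inbox_a, inbox_b = [], []
hub.register("a", inbox_a)
hub.register("b", inbox_b)

hub.send("a", "b", "hello")   # works while the hub is up

hub.online = False            # simulate the hub failing
try:
    hub.send("b", "a", "reply")
    delivered = True
except ConnectionError:
    delivered = False
```

Routing everything through one object makes monitoring trivial, but once `online` is false no spoke can talk to any other, and every message also adds load to the same node as the system grows.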
21:07
neck and the other one is uh the peer
21:07
And the other one is peer-to-peer. In peer-to-peer there's no central hub; all the nodes are intertwined and communicate with each other directly. The challenge here is how to manage and handle that, because every node is able to go and talk to any other node freely. Let's say you have some centralized database where you need to update information. Now it gets trickier, because the communication is happening node to node, so the other nodes are not aware of it. So sometimes there is a problem with eventual consistency: you won't be able to keep the data up to date so that the other nodes can look into it. So it basically works for smaller systems.
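To make the eventual-consistency point concrete, here is a minimal sketch in which peers exchange state pairwise. The `sync` function, the `(version, value)` layout, and the highest-version-wins rule are assumptions made for this illustration; the talk does not name a specific protocol.

```python
def sync(a, b):
    """Exchange state between two peers; each key keeps the higher-versioned entry."""
    for key in set(a) | set(b):
        newest = max(a.get(key, (0, None)), b.get(key, (0, None)),
                     key=lambda entry: entry[0])
        a[key] = newest
        b[key] = newest


# Three peers start with the same data: key -> (version, value).
nodes = [{"price": (1, 100)} for _ in range(3)]

# Node 0 receives an update; the other peers are not aware of it yet.
nodes[0]["price"] = (2, 120)
stale_before = nodes[2]["price"]

# The update only spreads as peers talk to each other.
sync(nodes[0], nodes[1])
sync(nodes[1], nodes[2])
```

Until the second `sync` runs, node 2 still serves the old value, which is the window of inconsistency the speaker is describing.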
21:57
And the third one is the messaging queue. This is a highly scalable architecture where messages are sent to a queue and then consumed by one or more consumers, exactly like what we discussed with the pub/sub model. This provides more scalability, because if you want to increase your consumers you can go ahead and do that, and if you see a lot of publishers publishing messages, you can also scale the publishers without impacting the consumers. That's how you get more scalability and flexibility, and this is one of the commonly used types of distributed systems.
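The independence of publishers and consumers can be sketched with Python's standard `queue.Queue` and threads. The function name `run_pipeline` and its parameters are hypothetical. Note that this is a work queue, where each message goes to exactly one consumer, rather than a full pub/sub fan-out; a real system would use a broker instead of an in-process queue.

```python
import queue
import threading


def run_pipeline(num_publishers, num_consumers, messages_per_publisher):
    """Run publishers and consumers against one shared queue; return processed items."""
    q = queue.Queue()
    processed = []
    lock = threading.Lock()

    def publisher(pid):
        for i in range(messages_per_publisher):
            q.put((pid, i))

    def consumer():
        while True:
            item = q.get()
            if item is None:  # sentinel: no more work for this consumer
                return
            with lock:
                processed.append(item)

    consumers = [threading.Thread(target=consumer) for _ in range(num_consumers)]
    publishers = [threading.Thread(target=publisher, args=(p,))
                  for p in range(num_publishers)]
    for t in consumers + publishers:
        t.start()
    for t in publishers:
        t.join()
    for _ in consumers:  # one sentinel per consumer, queued after all real messages
        q.put(None)
    for t in consumers:
        t.join()
    return processed


# Scaling either side is just a different argument; the other side is unchanged.
few_consumers = run_pipeline(num_publishers=2, num_consumers=1, messages_per_publisher=10)
more_consumers = run_pipeline(num_publishers=2, num_consumers=4, messages_per_publisher=10)
```

Neither the publisher nor the consumer function knows how many of the other exist, which is what lets each side scale on its own.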
22:38
And the last one is hybrid, a mix of all. Based on their requirements, I've seen people doing a mix of both worlds: they have used a hub-and-spoke architecture for the control traffic and a peer-to-peer architecture for the data traffic. So as per your needs and requirements, you can mix and match other types of distributed systems into one as well.
23:09
So now let's get into cloud. Now that you have understood distributed systems, the types and everything, when you're designing for the cloud you also have to make sure of a few things. For the cloud providers, think about scaling. Let's say you have created some sort of batch nodes where you get new nodes based on your traffic. You can write some sort of formula where, based on the load on your batch nodes, you automatically scale up or down. That way you control cost efficiency, but at times when the load is very high you are still able to scale up and keep the system healthy under that load. That's how you can basically think about autoscaling. It has more benefits: you can improve your application availability and also reduce the risk of performance issues or downtime due to sudden traffic spikes.
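One possible shape for such a scaling formula is a proportional rule: scale the node count by the ratio of observed load to target load, then clamp it. This is the same idea as the rule documented for the Kubernetes Horizontal Pod Autoscaler, but the function name, parameters, and limits below are assumptions for illustration, not the speaker's formula.

```python
import math


def desired_nodes(current_nodes, current_load, target_load,
                  min_nodes=1, max_nodes=10):
    """Proportional autoscaling: keep the average load per node near target_load.

    current_load and target_load use the same unit (for example, CPU percent
    per node); min_nodes and max_nodes cap cost and guarantee a baseline.
    """
    if current_nodes < 1 or target_load <= 0:
        raise ValueError("current_nodes must be >= 1 and target_load > 0")
    raw = math.ceil(current_nodes * current_load / target_load)
    return max(min_nodes, min(max_nodes, raw))


scale_up = desired_nodes(current_nodes=4, current_load=90, target_load=60)
scale_down = desired_nodes(current_nodes=4, current_load=15, target_load=60)
capped = desired_nodes(current_nodes=4, current_load=600, target_load=60)
```

The upper bound is what keeps a traffic spike from turning into an unbounded bill, while the lower bound keeps some capacity available when load is low.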
24:17
And the next one is about health checks. When you have these distributed systems, as I mentioned, maintaining health checks is very, very important, because if you don't know that a system actually went down because of some issue, then how do you recover from that? How do you know that there is actually a problem in the system? That's where health checks are very important. There are multiple ways to do this, like active and passive checks, but what you can do is basically make sure your systems are set up in such a way that there's some sort of monitoring, some sort of health checks happening. If you go ahead with some of the industry-standard products like Kubernetes or Service Fabric, they will automatically publish health events. They will tell you that there is some problem in an actual node, and you will automatically see those nodes getting into a warning or error state. That's when you have to plug in your services to look into those errors and quickly recover from there. So health checks are very, very important.
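A minimal sketch of an active health check follows, assuming a caller-supplied `probe` callable and made-up thresholds. The `HealthMonitor` class is hypothetical and is not the Kubernetes or Service Fabric API; it only mirrors the ok, warning, and error states mentioned in the talk.

```python
class HealthMonitor:
    """Active health check that counts consecutive probe failures."""

    def __init__(self, probe, warn_after=1, error_after=3):
        self.probe = probe              # callable returning True when healthy
        self.warn_after = warn_after    # failures before reporting "warning"
        self.error_after = error_after  # failures before reporting "error"
        self.failures = 0

    def check(self):
        """Run the probe once and return the resulting state."""
        try:
            healthy = bool(self.probe())
        except Exception:
            healthy = False             # a crashing probe counts as a failure
        self.failures = 0 if healthy else self.failures + 1
        return self.state

    @property
    def state(self):
        if self.failures >= self.error_after:
            return "error"
        if self.failures >= self.warn_after:
            return "warning"
        return "ok"


# Simulated probe results: one success, three failures, then recovery.
results = iter([True, False, False, False, True])
monitor = HealthMonitor(lambda: next(results))
states = [monitor.check() for _ in range(5)]
```

Requiring several consecutive failures before reporting an error is a common way to avoid reacting to a single dropped probe; a recovery service would act on the transition into the error state.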
25:29
And the next one is security. When you're dealing with the cloud, make sure that you follow the best practices in terms of security. Wherever required, encrypt your data, and make sure you prevent any unauthorized access with access controls. When you're dealing with storage accounts or passwords or anything, use AKs, or try to avoid using SAS tokens for accessing the storage accounts. And make sure you do threat monitoring. This helps you identify security threats early while building your system, and based on that you can address them and improve your system. So security is a key aspect when dealing with cloud-based distributed systems.
26:26
And the next one is fault tolerance. This is basically the ability of the system to operate even in the event of a failure. Some practices: you can go with redundancy, which is one way of making sure you have fault tolerance in the system, by having multiple instances of critical components. Even if one fails, the system automatically falls back to the others. That's how you make sure that, even in the event of a failure, your system is able to respond to user requests. Then you have failover mechanisms, which detect a failure and automatically switch to the backup system. How you do that with minimal downtime is what matters when you're building distributed systems. And again, monitoring for failures is the key part: you have to make sure you detect these issues, notify the users or the admins of your system, and make sure they don't cause a significant problem. That's how you have to design your systems to address fault tolerance.
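Redundancy with failover can be sketched as trying each replica in order until one answers. The function `call_with_failover` and the replica handlers below are hypothetical names for this example; real failover usually also involves health checks and timeouts, which are left out here.

```python
def call_with_failover(replicas, request):
    """Try each (name, handler) replica in order; return the first success.

    Returns a (replica_name, response) pair. Raises RuntimeError only when
    every replica has failed.
    """
    failed = []
    for name, handler in replicas:
        try:
            return name, handler(request)
        except Exception:
            failed.append(name)  # remember the failure and try the next replica
    raise RuntimeError(f"all replicas failed: {failed}")


def primary(request):
    # Simulates an instance that is currently down.
    raise ConnectionError("primary is down")


def backup(request):
    return request.upper()


served_by, response = call_with_failover(
    [("primary", primary), ("backup", backup)], "ping"
)
```

The caller still gets a response even though the primary failed, which is the behaviour the redundancy practice is after; the list of failed names is what a monitoring hook would report to the admins.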
27:46
And coming to performance optimization. This is another key part, where you want to make sure that your cloud systems are operating efficiently and effectively. A few ways of doing this: use caching as a technique, where you improve the overall system performance by storing frequently accessed data. You can make use of Redis cache, or most databases now have a caching ability in them, so you can make use of that and reduce expensive and time-consuming data operations. For data access as well, you can make use of techniques like indexing, prefetching, and minimizing network round trips. You don't have to make a call to your database every time you want to retrieve some information. If it's non-changing information, you can store it in your cache and use that cache every time rather than hitting your database. That way you can avoid some network round trips.
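The caching idea can be sketched as a small in-process cache with a time-to-live in front of a slow lookup. The `TTLCache` class, the `loader` callable, and the 60-second default are assumptions for illustration; a shared cache such as Redis would play this role across multiple nodes.

```python
import time


class TTLCache:
    """Cache-aside helper: serve from memory, call the loader only on a miss."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self.loader = loader   # function that performs the expensive lookup
        self.ttl = ttl_seconds
        self.clock = clock     # injectable so expiry can be tested without sleeping
        self._store = {}       # key -> (value, expiry time)
        self.misses = 0

    def get(self, key):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]                 # fresh hit: no round trip
        value = self.loader(key)            # miss or expired: go to the source
        self.misses += 1
        self._store[key] = (value, now + self.ttl)
        return value


db_calls = []


def load_from_database(key):
    # Stand-in for a real database query.
    db_calls.append(key)
    return key.upper()


cache = TTLCache(load_from_database, ttl_seconds=60.0)
first = cache.get("country:in")
second = cache.get("country:in")  # served from the cache, no second lookup
```

The time-to-live is the trade-off knob: a longer value saves more round trips, while a shorter one limits how stale the cached data can get.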
29:04
And yeah, so I think that's a wrap.
[Music]
