Join us on October 31 with Mark Brown on Rockin' The Code World with dotNetDave - a weekly show to learn & live Q&A on .NET and other programming technologies.
AGENDA
• Introduction
• Let's Fix It! (code challenge)
• The Interview: Mark
• Wrap up
GUEST SPEAKER
Mark Brown is a 19-year Microsoft veteran and has worked on a number of products and services, including Windows Mobile, Bing Maps, and Microsoft's Web Platform & Tools Team. Mark has been on Azure since 2011 and has worked on Azure Web Apps Service, Redis Cache, and Azure Networking.
C# Corner - Global Community for Software and Data Developers
https://www.c-sharpcorner.com
0:30
Thank you
0:59
Thank you
2:00
Hey, welcome everybody to the sixth edition of Rockin' the Code World with dotNetDave
2:11
I'm David McCarter, and I'm glad you're here. Happy Halloween, everybody. If you celebrate Halloween or doing well, I think Halloween is going to be really different this year
2:21
So stay safe. Don't go out, and do something else, like with your family or something
2:27
So anyway, I got an awesome show today. I have with us Mark Brown, who's the principal PM manager of Microsoft for Cosmos DB
2:37
So I've known Mark for a couple of years now, and I'm really excited to talk to him
2:43
And before the show, we were chatting. I just found out that we actually grew up in the same area of the country
2:49
So I didn't know that. So cool. So anyway, let's get the show started
2:54
Again, welcome. I'm glad you're all here. And, you know, again, real quick, you know
3:04
this show is for you. You know, I want to hear your thoughts. If you like the show, if you have
3:09
something to suggest for the show, if there's something you didn't like on the show, all emails
3:15
will be read by me and we'll make changes to the show as appropriate. So I really want the show to
3:23
be for you. So I need to hear from you. I still don't think I've gotten a single email from
3:29
the audience on the show. So please send me an email. Not right now, after the show
3:37
So every week we give away some prizes and every morning, every Saturday morning, I have to figure
3:44
out how I'm going to do that. And this week was tough. I'm not sure why, but anyway, so like we've
3:53
done on all the shows, we're going to give away a $50 Amazon gift card at the very end
3:58
And then C# Corner swag next, which is T-shirts, backpacks, and all that kind of cool stuff
4:06
So how do you win? Oh, and everybody wins CodeRush from DevExpress. CodeRush is the only
4:13
refactoring tool that I use for Visual Studio. I've never used any other one and I love it
4:17
and now you can love it too by getting your own free copy
4:22
So, but we'll do that at the end. So here's the giveaway for the C# Corner gifts
4:30
What does DDD mean in programming? Let's see if anybody can come up with that real quick
4:39
I'll give you like 15 seconds. What does DDD mean? I keep forgetting to put my little Jeopardy music in here
4:47
I should do that. Nobody? Nobody knows? Everybody's still asleep? Okay. Well, if someone comes up with it
4:59
you win. Okay. Oh, there we go. Geo, I got it. Domain-driven design. Yes
5:05
you're correct. You get the C# Corner gifts. Oh, well, two people at the same time
5:13
So Simon's going to have to pick who came in first, either Mark, my friend, or Geo
5:20
So anyway, I found a solution to an issue I've had for a while now in Visual Studio
5:27
and dealing with code analysis in .NET Core. As you all know, I'm really, really big into code analysis, coding standards, code quality
5:39
You know, that's what all my conference talks are wrapped around. And, you know, the code analysis story in .NET Core, it seems to be very confusing and fragmented, and I've been trying to figure it out
5:52
But one of the things that I did find, which I've been struggling with, is this
6:01
So if you have the Visual Studio version that has Analyze in it, it works better in .NET Framework, but in .NET Core, if you do Run Code Analysis on Solution, it will not actually run the code analysis correctly
6:19
I know that because, you know, when I first noticed this, I should have gotten a lot more violations than I was seeing through code analysis
6:28
and I keep running into this issue. So posting something up on
6:36
I think the Visual Studio feedback is where someone told me how to fix this
6:42
So to me, this is a bug because if you run code analysis
6:46
and it doesn't work, it's a bug. But here's how you work around it
6:50
Hopefully the Visual Studio team will work on this next week. I actually have someone on the Visual Studio team
6:57
I'm sorry, two weeks from now, I have someone from the Visual Studio team on, so we can ask her
7:03
But here's how to fix it. So in your project file, this is how I set up in my normal projects
7:12
I set up, I put in these two XML elements in there
7:17
One's called run analyzers during live analysis. I always keep that all false
7:22
And then run analysis during build, true. I do that for normal projects because I want code analysis to run after each build, because, you know, I tell people all the time that if you fix them then, it'll be a lot easier
7:35
to fix than six months or a year down the road. In unit test projects, I don't really want it to
7:40
run at all. So I turn them all false, but then I was working on it this morning and it's still
7:46
running code analysis on those. So I'm still trying to figure it out. So anyway
7:52
the top one does work for sure. So if you want to run code analysis in .NET Core
8:00
you're going to need to add these two attributes to your project file
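A minimal sketch of the project-file settings described above. The slide itself is not in the transcript, so the property names below are an assumption: they are the MSBuild properties that most closely match what is read aloud.

```xml
<!-- Sketch of a .csproj fragment; property names are assumed, not taken from the slide. -->
<PropertyGroup>
  <!-- Do not run analyzers in the editor while typing -->
  <RunAnalyzersDuringLiveAnalysis>false</RunAnalyzersDuringLiveAnalysis>
  <!-- Run analyzers as part of every build -->
  <RunAnalyzersDuringBuild>true</RunAnalyzersDuringBuild>
</PropertyGroup>
```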
8:06
Okay? So I'd like to bring on our guest, Mark Brown
8:16
Oh, I'm sorry, Mark. I forgot to change the name at the very top
8:21
Oh, man. Anyway, let me read Mark's bio. Mark is a 19-year Microsoft veteran and has worked on a number of products and services, including Windows Mobile, Bing Maps, and Microsoft's Web Platform and Tools team
8:37
Mark has been on Azure since 2011 and worked on Azure App Services, Redis Cache, Azure Networking
8:46
Currently, Mark works on the Cosmos DB team. That's where I met him
8:51
And is the PM for the high availability features, including replication, consistency models
8:58
multi-master and conflict resolution. And there's more. I'm going to stop there
9:04
Mark can tell you the rest. So with that, let's bring on Mark. Hey, Dave
9:08
How are you? Can you hear me? Yep. Hey, Mark. Hey there
9:14
Sorry, I totally screwed up your slide there. That's all right. That's what happens when I'm trying to change slides at 8 o'clock in the morning and I'm tired, you know
9:25
Jeremy's a good guy. It's perfectly fine to mix and match my name
9:30
Well, you're kind of on the same team, right? Well, no. No? Okay
9:35
He's over in .NET land in DevDiv. So I work over in Azure
9:41
But we work closely together. I just had a meeting with him just this week, actually
9:48
And we were talking about the Cosmos provider for Entity Framework. And I reached out to him
9:58
I've been meaning to reach out to him for a while because I manage all the developer relations stuff for Cosmos
10:04
So I'm on Twitter all day watching people or just catching questions from folks and Stack and everything
10:11
And I've been seeing over the past year or two kind of an increasing number of questions about Cosmos using Entity Framework
10:21
And so I just reached out and I said, look, it looks like, you know, more and more people are using it
10:27
There's some gaps there. Obviously, there's some things we could do to make the experience better
10:32
What can we do or what should we do? Until I talked to Jeremy, I didn't know you had
10:39
an Entity Framework provider for Cosmos DB. 'Cause I remember
10:43
I think when I was getting into Cosmos DB, there wasn't one. So, um
10:47
there is one now. So I'm looking forward to using that, you know, when I get back to my project. Yeah. You know, it's interesting. Um
10:56
using an ORM for a NoSQL database is kind of strange at some level. Yeah. Yeah
11:02
Because, you know, you don't really need one, actually. Right. You can just take your POCOs and serialize them and write them straight into Cosmos and then, of course, the reverse
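A minimal sketch of the "serialize your objects and write them straight in" idea, in Python rather than .NET. The `Customer` class and the in-memory `container` dictionary are hypothetical stand-ins, not the Cosmos DB SDK.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Customer:
    """Plain object with no mapping attributes (hypothetical example type)."""
    id: str
    name: str
    city: str


# Stand-in for a document container: maps item id -> JSON text.
container = {}


def write_item(item):
    """Serialize the object as-is and store it; no ORM layer involved."""
    container[item.id] = json.dumps(asdict(item))


def read_item(item_id):
    """The reverse: load the JSON and rebuild the object."""
    return Customer(**json.loads(container[item_id]))


write_item(Customer(id="c1", name="Ada", city="Seattle"))
restored = read_item("c1")
```

The object's shape is the document's shape, which is why no relational mapping step is needed.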
11:14
Right. But people are obviously very familiar with it, which is why there's so much usage
11:21
And so people want to, you know, they don't want to have to relearn everything or learn new stuff
11:25
and they want to get their projects done and get home at the end of the day, I guess. Yeah, that's true
11:30
I think the biggest challenge I see is people think they can just kind of rip and replace the database on the back end and not have to change anything
11:41
And that's not that's not how it works. This is a it's a fundamentally different database
11:49
It's not a relational database. It's a NoSQL database. And just things are not the same
11:55
You cannot just change your connection string and then just start writing to Cosmos with that
12:03
So anyway, it's not just with EF. I think generally speaking, it's one of the biggest challenges we see is educating users as to how to design and model data for this type of database
12:20
Because it's different. With a SQL server, a relational database, that's a scale-up type of database, right
12:28
You want better performance? You put it on increasingly bigger VMs with more CPU, more memory, more storage, all that stuff
12:37
With us, we're the opposite. We're scale-out. So you don't put it on bigger VMs
12:42
You just keep adding VMs or machines or compute, right? The challenge there is it's a different computer
12:49
So you have to shard or partition your data. And that's where people start getting really confused
12:56
They don't understand this notion of sharding data. And so that's kind of where we start with is trying to help users, developers primarily, understand how to design for this type of database and how to get performance
13:12
I mean, the idea and the promise of a distributed NoSQL database is unlimited scale
13:17
because you can keep adding computers to it, and you're kind of gated, you're limited with a relational database
13:24
as to how big it can grow. You're unlimited. You can just keep adding VMs or keep adding machines to it
13:31
The challenge is you've got to be able to design it such that data is evenly distributed on your writes
13:37
to utilize all those VMs, because if you don't, you're basically paying for compute you don't need
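A rough sketch of why the choice of partition key decides whether writes spread evenly. The hash function, the partition count, and the key names are all assumptions made for illustration; they are not how Cosmos DB hashes keys internally.

```python
import hashlib
from collections import Counter

PARTITIONS = 4  # hypothetical number of machines


def partition_for(key):
    """Map a partition key value to a machine using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % PARTITIONS


def distribution(keys):
    """Count how many writes land on each partition."""
    counts = Counter(partition_for(k) for k in keys)
    return [counts.get(p, 0) for p in range(PARTITIONS)]


# 1,000 writes keyed by a high-cardinality value (one key per user).
by_user = distribution(f"user-{i}" for i in range(1000))

# 1,000 writes keyed by a low-cardinality value (every item has the same key).
by_country = distribution("US" for _ in range(1000))
```

With one key per user, every partition receives a share of the writes. With a single repeated key, one partition takes all 1,000 writes and the other three sit idle, which is the "paying for compute you don't need" case.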
13:45
On the flip side, when you're reading data, you want to limit how many computers you're talking to and use hopefully
13:52
ideally just a single VM or computer to talk to because you're going to get
13:57
better performance. Yeah. So anyway, yeah. Yeah. That's, you know, to kind of catch everybody up on how, you know
14:06
I got to know Mark is, you know, two years ago, I was trying to learn Cosmos DB for the first time in this project that I'm
14:15
working on, which I also turned into a conference talk. And that was when I was running into
14:22
problems because I think like a lot of other developers, you know, I grew up in the relational
14:28
database world and Cosmos DB is just, I have to forget pretty much everything I learned in the
14:34
relational database world and start over again. And I am, you know, even though I've been using
14:39
Cosmos DB for two years now, not in production, just my own projects, I am still lost
14:47
And with that, I haven't spent a ton of time in the last two years trying to learn things
14:53
because I've been so busy, but I need a lot of training. And I think maybe the younger millennial developers, maybe not so much, but us older developers definitely, I think, need a lot of handholding and training, and also training from, okay, here's how we did it in the, you know, relational world
15:14
Here's how you do it in the document DB world, you know, and even, you know, and we'll probably get into talking about this, you know, it's been beaten into our heads over and over again, don't duplicate data in a database
15:31
Right. And then here comes Mark Brown saying duplicate data in the database, you know, and so it's, it's, it's going to take a while for us older one, older developers, I think, to change our mind
15:44
I guess it's high levels of cognitive dissonance, I suppose, right? Yeah, I came up in the same world as you did, right
15:54
I've been in this business since the early 90s. I mean, I think, you know, I was a Sybase or early SQL user back then
16:05
You do have to unlearn quite a bit of stuff, right? Yes. The way you think about it is in a relational world, right
16:12
Relational database is what? Designed in 1970, right? Yeah. In the 70s, storage was really expensive
16:20
I think like a megabyte or a few megabytes of data was like a couple hundred thousand dollars
16:26
or something like that. So relative to compute, storage is very expensive
16:32
It made sense to normalize the heck out of data and reduce duplicate data because as a percentage of costs
16:39
storage is very expensive. These days, I mean, I got this phone
16:43
I got a terabyte of storage on this thing. Yeah, no. So as a percentage of compute, storage is cheap now
16:53
So it doesn't make sense. Normalizing data doesn't make sense. The other thing too is to consider is
16:59
with this type of database, because as a percentage of compute storage is so cheap
17:05
you want to optimize around the request, right? So you want to serve your data with as few round trips to the server as possible
17:13
Right. So that means you want to embed data, right, or denormalize the data so that you get it in a single request rather than having to do multiple requests across machines
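A small sketch of the trade-off described here: the same customer stored normalized across three stores versus embedded in one document. All data and names are hypothetical, and each lookup is simply counted as one request.

```python
# Hypothetical sample data; every name here is illustrative.
CUSTOMERS = {"c1": {"id": "c1", "name": "Ada"}}
ADDRESSES = [{"customerId": "c1", "city": "Seattle"}]
PHONES = [{"customerId": "c1", "number": "555-0100"}]

# The same information stored as one denormalized document.
DOCUMENTS = {
    "c1": {
        "id": "c1",
        "name": "Ada",
        "addresses": ["Seattle"],
        "phones": ["555-0100"],
    }
}


def load_normalized(customer_id):
    """Assemble a customer from three separate stores: three requests."""
    requests = 0
    customer = dict(CUSTOMERS[customer_id])
    requests += 1
    customer["addresses"] = [
        a["city"] for a in ADDRESSES if a["customerId"] == customer_id
    ]
    requests += 1
    customer["phones"] = [
        p["number"] for p in PHONES if p["customerId"] == customer_id
    ]
    requests += 1
    return customer, requests


def load_embedded(customer_id):
    """Everything needed is already in one document: one request."""
    return DOCUMENTS[customer_id], 1


normalized_result, normalized_requests = load_normalized("c1")
embedded_result, embedded_requests = load_embedded("c1")
```

Both paths return the same customer, but the embedded form needs one request instead of three. The cost is that the duplicated data has to be kept up to date wherever it is embedded.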
17:23
The other thing, too, is, you know, this type of database didn't just come about for no reason
17:28
I mean, there was there is a reason for it. And the reason is because of things like cloud scale
17:34
Right, companies like Facebook and Twitter and others that are just dealing with enormous
17:43
amounts of load and requests. You know, a relational database just can't meet that many customers or
17:49
that many users. Also consider that these users are now located all over the planet. Yeah. It's not a,
17:56
it's not a bunch of people sitting in a single building with a data center attached to it
18:00
They're all over the planet. So you need to have a distributed database as well
18:05
You know, SQL just doesn't, it just doesn't work in those types of areas, right
18:10
Which is why, you know, you have these new types of databases now
18:13
things like Cassandra or Mongo and certainly Cosmos that are designed to work as partitioned stores
18:21
and also be distributed as well as data. So, yeah, I love it, man
18:27
It's great. I mean, yeah, it's, does it, there's a bit of a learning curve
18:32
But once you get it and you can figure out how to design and model data around
18:37
your access patterns rather than storage by entities, you know, you can get some amazing scale out of it
18:45
Yeah. Yeah. Once I figure out how to do that correctly, because you've even schooled me this year: "Dave, you're doing it wrong"
18:52
I said, Mark, I got to get the article out. So I'll do another one on how to fix what I did in the first one
19:01
You know, kind of like here's how I did it as a relational database guy
19:06
and here's how you really should do it. You know, so I'm still going to do that, Mark
19:10
Don't worry. But we're getting some questions already. Simon, do you want to put the first one up
19:22
Hey, Sean, how are you, buddy? What are the benefits of using Entity Framework with Cosmos
19:32
The benefit I guess you would get is if you're already familiar with Entity
19:35
Framework, then you don't have to learn to use our SDK to do it
19:42
Let's, let me put it this way. If you were, it's not, if you're trying to squeeze a lot of performance out of your solution
19:49
because you have to go through an ORM that essentially is wrapping our SDK
19:55
you're not getting the best performance you can out of that. So if what you're looking for is super high performance
20:04
you're going to want to use either our .NET or our Java SDK
20:09
Although I shouldn't say Java on this show. But one of those two
20:14
This is all about programming, not .NET. Not .NET, okay. It just happens that all my friends are .NET people
20:20
Same here. but you're going to want to use our native SDKs to do that
20:27
But that said, there's, you know, quite a few people use Entity Framework because, you know
20:33
it's just they want to get to market quicker with the work you're doing and they're familiar with it. And I totally get that. I totally, totally get that
20:40
So, which is why I met up with Jeremy and his, his dev team this week and we talked about it and we said
20:47
what should we do about this? So we're going to look at it and see, you know, can we make that experience a little better and help take some of the rough edges off that
20:57
Because there are some rough edges on that. Yeah. Yeah. I think when I first was looking at Cosmos DB, I wanted Entity Framework because I just didn't understand Cosmos DB
21:10
And I think I was hoping that if I could just do it through Entity Framework, I could, you know, learn it maybe a little bit faster
21:17
but it wasn't around when I started. We have more questions. Simon, do you want to put up another one
21:27
Yeah, is there a process? I see this all the time too. You know what
21:33
Yes, there is a process, but what it really involves is understanding the access patterns
21:41
for your application. We have a great article. Is there a way I can share
21:47
Yeah, you can share. Simon can put your screen up. There's a share down at the bottom
21:54
I was just going to pop it into, can I pop it into the chat
21:59
Oh yeah, you can. Oh yeah, he can do it in the private chat
22:04
You can do it in the private chat, then Simon will post it on the public one
22:08
I just put a link to an article. We wrote this article about a year ago
22:14
a little over a year ago because we knew customers were struggling with how to take a relational
22:19
database or that kind of scenario and design a scalable database with Cosmos DB. So the scenario here is like a blog platform, right? Which everybody has done, you know, their own custom blog, I think. Well, most people have, but
22:38
using like a relational database. So you've got posts and comments and users, and they're
22:44
all in different tables. And of course, they've got relational or foreign keys into each of
22:49
these things because a user, you create a new user, and it can be an author and they create
22:53
a new post, and then each post can have one or more comments. And so that's kind of your relational model
22:59
So how would you design such a thing? So anyway, this article walks you through the process of understanding how would I design
23:06
this or migrate it into a NoSQL model here? And the key is understanding, all right, what operations am I doing on my database, right
23:15
So I've got to create a user. I'm going to create a post
23:20
I'm going to add a comment or comments. If I'm an author, I want to see my post
23:26
So I'm going to do a query by user ID on the homepage of my blog platform
23:31
I want to see like a like a bunch of posts in there. Right
23:36
And then the count of comments. Right. So those are that these things I'm describing are kind of the steps for, OK, how am I querying the database
23:46
because that's going to drive the design I come up with and the model that I'm going to use
23:52
And also, how do I partition my data? Because those queries that I run
23:56
I want to be able to answer with a single partition query where I can or at least a bounded partition
24:02
So like with a partition key, like an IN statement or something like that
24:07
So what I want to avoid is hitting all the partitions. Now for a smaller database
24:14
cross-partition queries are fine. And generally, in some scenarios, they're okay once in a while
24:21
But the problem is as the database gets larger, cross-partition queries get increasingly more expensive
24:26
So that key is important. What I would say overall is look at the process as being iterative, right
24:34
You want to test it, measure your RU, your throughput charge, and then see, hey, can I make that any better in there
24:42
So, but I think once you kind of get kind of the hang of it and start realizing kind of how to frame it in your mind, it becomes easier
24:51
But you're building a new skill when you're doing this. It's a new skill
24:56
And there's no tools yet that make this easier. It's just a matter of kind of building out that muscle
25:06
Yeah. So everybody knows what we're talking about. Can you just quickly explain what a partition key is for those people who don't know what it is
25:16
A partition key is essentially kind of like provides a way of routing data
25:22
Think of it this way. If I have a, let's say I'm a pizza delivery guy and I've got to go and deliver a pizza
25:30
And I'm driving down a street and there's four apartment buildings, each the same size
25:36
They're all 10 stories tall, 10 units to each floor. And the only thing I know is I have to take it to unit 202
25:45
But I don't know which apartment building I'm going to. So if I write a query that does not include the partition key, I'm going to go into building A, building B, building C, building D
25:57
That takes a lot of effort and a lot of, if you will. The partition key tells me which building to go to
26:03
So let's just say it's building B unit 202. Now I know exactly which building to go to and to go deliver my pizza
26:11
Does that make sense? So when I do request whether I'm writing data, well, if I'm writing data, you've got to include it
26:17
But if I'm reading data, I want to include that partition key because it essentially provides a route to where my data is, to what machine my data sits on
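The pizza-delivery analogy above can be sketched directly. The buildings play the role of partitions and the unit number plays the role of the item ID; the tenant names and data are made up for illustration.

```python
# Four "apartment buildings" (partitions), each mapping unit number -> tenant.
BUILDINGS = {
    "A": {"101": "Lee", "202": "Kim"},
    "B": {"202": "Patel", "305": "Ortiz"},
    "C": {"202": "Nguyen"},
    "D": {"410": "Silva"},
}


def find_without_building(unit):
    """No partition key: every building has to be visited (fan-out)."""
    visited = 0
    matches = []
    for building, units in BUILDINGS.items():
        visited += 1
        if unit in units:
            matches.append((building, units[unit]))
    return matches, visited


def find_with_building(building, unit):
    """Partition key supplied: go straight to the one building."""
    return BUILDINGS[building].get(unit), 1


all_matches, buildings_visited = find_without_building("202")
tenant, single_visit = find_with_building("B", "202")
```

Without the building, all four are visited and unit 202 turns out to exist in three of them. With the building, one visit finds the right door.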
26:29
And then the unit number is kind of like the index. It's the ID. So the other thing to keep in mind too, and I see a lot of users doing this, if you're retrieving a single row of data, don't use a query
26:42
Use what we call a point read. It's basically calling our ReadItemAsync in our SDK
26:49
And then with that, you just pass the partition key and the ID, and then we'll return you that data
26:54
It's the fastest operation. It's the fastest way, cheapest way to read data
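A sketch contrasting a point read with a query for the same single item. `FakeContainer` is a hypothetical in-memory stand-in, not the real SDK, and it says nothing about actual RU charges; it only shows that a point read is a direct lookup by partition key and ID while a query has to evaluate a filter.

```python
class FakeContainer:
    """Hypothetical in-memory stand-in for a container; not the real SDK."""

    def __init__(self):
        self._items = {}  # (partition_key, id) -> document

    def upsert(self, doc, partition_key):
        self._items[(partition_key, doc["id"])] = doc

    def point_read(self, item_id, partition_key):
        """Direct lookup by partition key + id; no filter is evaluated."""
        return self._items[(partition_key, item_id)]

    def query(self, predicate):
        """Query path: evaluate a filter against stored documents."""
        return [doc for doc in self._items.values() if predicate(doc)]


container = FakeContainer()
container.upsert({"id": "202", "tenant": "Patel"}, partition_key="B")
container.upsert({"id": "202", "tenant": "Kim"}, partition_key="A")

# Point read: both parts of the address are known, so no query is needed.
by_point_read = container.point_read("202", partition_key="B")

# Query for the same item: works, but goes through filter evaluation.
by_query = container.query(lambda d: d["id"] == "202" and d["tenant"] == "Patel")
```

When both the ID and the partition key are already known, the query adds work without adding information.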
26:58
No, I didn't know that. Well, maybe I did. I have to go back and look at my code. I don't know
27:02
If you're writing a query where you're passing the ID, you can do that as a point read
27:10
It'll save you a ton of RU. Yeah, I do that in my app
27:14
And I'm actually in beta testing of the new version of my app, which uses Cosmos DB and all that Azure stuff
27:25
And so I'm testing it right now. All right. Simon, do you want to put up next question
27:32
We're getting a bunch of questions, which is good. I'm happy. What's the network protocol for Cosmos DB
27:38
There's two ways you can connect. You can connect over HTTPS. That goes over our gateway
27:45
However, in our newer v3 .NET SDK, and also I think in our newer Java v4 SDK
27:54
you connect over TCP. And actually, so you bypass our gateway, you go directly into our backend
28:02
Right. So where all of your data sits. So all of the partitions where your data sits
28:09
And what we do is when you initially connect and do the handshake, we actually will pass back a bunch of routing data into the client
28:19
That basically tells you it basically gives you partition key ranges for your data so it knows how to route
28:25
So if you give a partition key value and say a point read or a query, we know where to route that data to the physical machine in our backend
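A simplified sketch of client-side routing with partition key ranges. The hash function, the four equal ranges, and the node names are assumptions for illustration; the real routing data returned by the service is not described in this conversation.

```python
import bisect
import hashlib

HASH_SPACE = 2 ** 32

# Hypothetical routing table handed to the client at connect time:
# the exclusive upper bound of each hash range, and the node that owns it.
RANGE_UPPER_BOUNDS = [
    HASH_SPACE // 4,
    HASH_SPACE // 2,
    3 * HASH_SPACE // 4,
    HASH_SPACE,
]
PARTITION_ADDRESSES = ["node-0", "node-1", "node-2", "node-3"]


def hash_key(partition_key):
    """Hash a partition key value into the range [0, HASH_SPACE)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def route_hash(hashed):
    """Find the node whose range contains this hash value."""
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, hashed)
    return PARTITION_ADDRESSES[index]


def route(partition_key):
    """Route a request locally, without asking a gateway where the data lives."""
    return route_hash(hash_key(partition_key))


target = route("user-42")
```

Because the lookup happens on the client, the request can go straight to the owning machine, which is the extra hop that direct mode saves.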
28:35
So that's got to be a heck of a lot faster, right? It's way fast
28:40
It's much, much faster. Going through a gateway is just basically like making another hop
28:44
Right, right. The gateway essentially does all that routing. That was the older way we had it was with our gateway
28:52
Now it's basically direct TCP. So are you going to get rid of the old gateway
28:58
or just keep TCP? We have to keep it because we have lots of customers that connect
29:04
using the REST API, JavaScript. We have a Node.js SDK which uses our gateway
29:12
and a Python SDK which uses our gateway. So what you're saying
29:18
to me sounds like, for example, my app that all lives in Azure
29:24
it's better to use the TCP method than the gateway. Yeah, you're definitely going to get much, much better performance
29:32
If you don't have to do that extra hop. Yeah. Yeah, definitely. It's going directly to our backend
29:38
We also have another wire protocol that you may see if you're looking at the diagnostic logs. It's the RNTBD
29:49
protocol, which is basically just something we made up. It's our own custom wire
29:55
protocol. You don't talk to it, but it's basically how
29:59
we talk to our backend in there. And actually it means "real name to be determined"
30:06
That's the name of the protocol. So, and we never came up with a name for it. Well, now I need to do more modifications to my app
30:13
So Simon, next question. Serverless offer for the core API is amazing
30:23
Will it be available for table API? Yes, we'll do serverless for all of the APIs at some point
30:30
That's our plan. I know for people that are used to the legacy table storage, I guess the storage team is
30:43
encouraging people to use our Table API. We have wire protocol support for Table API, for table storage in Cosmos
30:53
But yeah, that'll be good. And then we just obviously work on improving the cost for storage as well. Yeah
31:03
So I'd like to go back a little bit to a question we had before
31:09
And everybody else out there, please send in your questions. I think we've run out
31:13
So Mark's here. This is your opportunity to ask him live and in person
31:18
So please send your questions. I'd like to go back a little bit to a question before about migrating
31:27
And do you, like, maybe recommend to your customers to maybe not migrate the whole freaking database at one time
31:40
and just start moving over chunks, you know? Like, kind of like we do with programming, you know, we just move chunks of programming to the new .NET core
31:49
until we get everything moved over. You know, do you kind of see that too for the migration? With our larger customers that are
31:57
running like pretty hefty size workloads, the only way you can do it is to kind of
32:03
segment the app in such a way that you can migrate pieces of the workload over
32:11
we are increasingly seeing customers running very large kind of workloads, relational workloads, and they're trying to move them into the cloud and deciding
32:26
that they need to get better scale out of the thing. And like I said, at some point, you kind of hit a ceiling with relational data
32:35
This is MySQL, SQL Server, Oracle, whatever. It doesn't matter. Postgres. And as part of that modernization, if you will
32:45
they're looking at evaluating and in many cases actually doing migrations to Cosmos
32:52
because they need that better scale. Now that's an insanely difficult bit of work to do
33:00
because everything is changing in your backend, right? All the code that's touching the database is changing
33:07
So it's quite a bit of work to do that. But we see customers that are doing it
33:14
and being successful with it. You also need to kind of keep it live at the same time
33:18
So we see customers using like Kafka or something like that to basically keep the two systems in sync
33:23
until they're ready to kind of pull the switch. Yeah. So yeah, not for the faint of heart
33:31
Yeah, I bet. I mean, just like any kind of live migration
33:36
it's very, very, very. Well, you know, Jeremy was talking last week about GraphQL
33:44
Could GraphQL maybe be used to help with that or no? I'm not really sure
33:50
You know, I don't know a heck of a lot about GraphQL. Yeah, I don't either, actually
33:55
Oh, okay. Yeah. All right. I think we have another question, which I think is actually a good one
34:01
What are the limitations of Cosmos DB emulator compared to the Cosmos DB
34:05
Yeah, so because the emulator only runs on a single machine, your computer, there's no replication, obviously, or consistency models
34:16
Yeah. Also, today it only works on Windows. But I can say that we are working on porting that and providing a version of that that will run on Linux or Mac
34:34
We've been actually at that for quite a bit of time. I manage our user voice as well, by the way
34:39
So if you tracked our things up there, I've had that as "started" for probably two years
34:52
We were actually working on it and then had to defer the work. You know how things go
34:58
Priorities change. Yeah. Just you got to pull people around and put them on different projects or work or whatever
35:06
like that. So we kind of started and stopped that work a few times, but we're making a push to get
35:13
to get it done in there. So that's another limitation as well
35:16
It only works on Windows today, but we'll get that fixed
35:21
I think someone just after that asked, is there a way to run a local instance for testing
35:28
and I think that's what we basically provide a container for it
35:32
Oh, do you? Yeah. Let me see if I can find the good thing I've got our docs up here
35:42
Let me see if I can find where the heck is this thing
35:47
We just, just last week changed. We had what we call a docathon
35:53
So we've got the whole PM team to go in and update all of our docs
35:58
And then we changed all of our, I don't know where this thing is now
36:05
Oh, here we go. No, that's not it either. Oh my God
36:11
I'll find it somewhere. Oh, here it is. Develop locally. We moved all of our docs
36:19
So we changed the table of contents in there. Here, let me docker
36:27
Here, I'll throw this in the chat window as well. So, yeah, you can just do a docker pull to this URL, and then you can basically stand
36:40
it up. Cool. So, yeah. You know, this reminds me of something I might have asked you a long time ago when we first
36:49
started chatting. And the reason I ask this is because, you know, one of the things I talk about a lot when I'm, you know, speaking at conferences and I write is the importance of, you know, doing local data caching
37:05
Right. And because kind of like what you were saying a little bit ago that, you know, one of the things I caution anybody I can talk to is don't go across the wire unless you have to
37:15
Right. Because that's the biggest performance killer of your app is going across the Internet
37:19
So, because of that, you know, I recommend people cache as much stuff locally as they can so they don't have to go across the wire. So do you know if there's any work to maybe have, like, you know, we have SQLite that works on all platforms
37:38
Right. Is there is there maybe a plan in the future to do like a Cosmos DB light just to do that kind of thing, you know, on the client side
37:48
Nothing specifically for the client. We've been looking at that for quite a bit. There's
37:57
nothing to do with kind of having a local client, kind of like a SQLite version. But
38:05
we are, we are looking at it. We're more looking at kind of edge scenarios, but more targeted
38:12
around kind of an IoT edge kind of scenario. Yeah. What I can say is, I mean, I totally agree with you with, you know
38:21
look at going over the network is the biggest performance killer. I mean, we are a distributed database, right
38:28
So we're in every Azure region. So you can replicate your data as close as possible to where your users are going
38:36
to be to help reduce that latency request, right? Now, that's great for reads
38:44
When you're running in, let's say, a single master mode where you've got a single primary region where you're writing to and then you replicate and then all your reads happen more locally, I guess you can say
38:56
For scenarios where you need low latency for reads and writes, you can use our multi-master or multi-region writes, we call it now, where you can turn every replica into a writable replica
39:09
and then replicate that data back to the primary region and then merge it and then do any conflict resolution
39:17
if there is a conflict and then from there replicate back out
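The merge-and-resolve step described here can be sketched in a few lines. This is a minimal illustration of a last-writer-wins policy, which is one common way to resolve conflicts between writable replicas; it is not Cosmos DB's actual implementation. The `Version` type, the integer timestamps, and the tie-break on region name are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: int  # hypothetical logical write time
    region: str     # region that accepted the write


def resolve_last_writer_wins(a: Version, b: Version) -> Version:
    """Keep the version with the higher timestamp; break ties by region name."""
    return max(a, b, key=lambda v: (v.timestamp, v.region))


def merge_replicas(replicas: list[dict[str, Version]]) -> dict[str, Version]:
    """Merge per-region writes into one view, resolving conflicting keys."""
    merged: dict[str, Version] = {}
    for replica in replicas:
        for key, version in replica.items():
            if key in merged:
                merged[key] = resolve_last_writer_wins(merged[key], version)
            else:
                merged[key] = version
    return merged


# Two regions accepted a write to the same key; the later write wins.
west = {"cart:42": Version("2 items", 100, "West US")}
east = {
    "cart:42": Version("3 items", 105, "East US"),
    "cart:7": Version("1 item", 90, "East US"),
}
merged = merge_replicas([west, east])
```

After the merge, the resolved view is what would be replicated back out to every region.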
39:25
I'm sorry, go ahead. We have one more thing that we're working on too to help
39:29
not specifically with locality, but with caching, if you will
39:39
So we're working on a cache, actually, for Cosmos. One that will be local, too
39:47
It won't be local. It'll be within the region. But the idea here is if you've got data that's frequently read
39:57
rather than having to query it every time, cache it and read it from the cache
40:02
So it'll operate just like a cache. But the idea is it'll help with read heavy scenarios
40:07
particularly ones where you've got very heavy queries that are expensive to run
40:12
and slow. You can cache results, I think, and then just read from the cache. So that's, you know, how to
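The idea of caching the results of expensive queries can be sketched as a small read-through cache. This is only an illustration of the general pattern, not the cache the Cosmos DB team was building; the class name, the time-to-live parameter, and the `expensive_query` stand-in are assumptions for the example.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Serve repeated queries from memory until their entry expires."""

    def __init__(
        self,
        run_query: Callable[[str], object],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, object]] = {}
        self.misses = 0  # how many times the backing store was queried

    def get(self, query: str) -> object:
        now = self._clock()
        entry = self._entries.get(query)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh hit: no trip to the database
        self.misses += 1
        result = self._run_query(query)
        self._entries[query] = (now, result)
        return result


def expensive_query(query: str) -> str:
    # Hypothetical stand-in for a slow, costly database query.
    return f"result of {query}"


cache = ReadThroughCache(expensive_query, ttl_seconds=30.0)
first = cache.get("SELECT * FROM c WHERE c.type = 'order'")
second = cache.get("SELECT * FROM c WHERE c.type = 'order'")
```

The second call is answered from the cache, so the expensive query runs only once within the time-to-live window. The trade-off is that cached results can be stale for up to that window.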
40:18
Yeah. You know, the other reason I was asking that question is, you know, not only for, you know, not going across the wire, of course, but, you know
40:25
something else I talk about that I don't think I really hear anybody talk about
40:30
And that's, you know, a concept of occasionally connected app, you know, and, you know, in a sense, almost every app is occasionally connected, right
40:41
Because the Internet could go up and down. Right. And so but but even beyond that, you know, let's say you have an application that, you know, you want to work online and offline
40:55
Right. And so, you know, having a local, you know, Cosmos DB store would help with that
41:03
And, you know, when you were just talking, you know, it made me think back to a while ago, Microsoft
41:10
And I don't know why Microsoft did this. If someone at Microsoft could tell me, I'd love to know
41:16
But Microsoft used to have a really decent framework, you know, to do this called the Microsoft Sync Framework
41:24
And they abandoned that project for some reason and with no replacement
41:30
And ever since that time, I've been wanting a replacement to that because that solved
41:37
I mean, it wasn't the most elegant thing in the world if you tried to use it, but it did solve some issues we've had in it
41:45
And I think you guys wrote it for yourself because I think there were some apps at Microsoft that used to use it
41:52
Right. And so that was for apps, mobile apps that were being loaded into the store
41:59
The Microsoft Store, I think, is where that whole framework was
42:03
You know, we learned about that a couple of years ago and then started having some conversations with them
42:11
Like, hey, how can we make this better? It's actually worked really well. They have a good key value store for Cosmos and it would do the sync and everything in there
42:18
They did all the work. We didn't even realize they had built this thing
42:23
And then they killed it. Yeah, I know. I know. Insert foot in mouth or whatever
42:36
That's going to be my next UserVoice: bring back the Microsoft Sync Framework
42:41
You should see my ever-growing list in that thing. I have a huge list
42:46
I think the last time I looked at it, I had like 28 items or something in there
42:51
Yeah, I'm sorry. It wasn't my decision. Yeah. But, yeah, I kind of wish they would have kept going with it, you know
43:00
because we really need that. And, you know, quite frankly, you know
43:05
the reason developers don't do that is because it's too hard. It is
43:10
I just had a customer the other day. He's writing an Android app using us and was talking to him
43:16
And he's talking about what's available out there for doing this. And I said, he's writing it for Android
43:23
I'm like, you basically got to write your own kind of stuff. I said, I'm sorry. Yeah
43:28
And the sync framework took care of the biggest issues with that whole way of doing things
43:36
Anybody out there wants to go and start their own company. I know
43:41
Go build a sync framework for SQLite. Right. Well, hey, you know, we had the startup company conference yesterday. If Mahesh is listening, Mahesh, here's a new idea for you and me to work on. I'm ready
43:58
I'll promote you on our Twitter handle and wherever. Maybe get you in as a partner. So, yeah
44:06
Yeah. So any more questions, guys? Come on. We still have about 10 minutes left. So
44:11
By the way, I wanted to share another thing. I'll pop this into the chat
44:17
There's an MVP, this guy named Lenny, Lenny Lobel. He's awesome
44:23
And he creates, he does two, I think three Pluralsight courses on Cosmos
44:28
He has one around data modeling and partitioning that's just fantastic. What he did is he took AdventureWorks 2017 database, which most of us are pretty familiar with, right
44:38
He took that. He took that database and walked through the steps on how to model it and migrate it from
44:46
SQL Server to Cosmos. The course is about two hours, a little less
44:53
It comes with all the source code he used to migrate the thing and walks you through it. It's one of the best courses, if you will, or pieces of learning for how to design and model
45:08
and migrate a relational database. Hey, there's your buddy Mahesh. Hey, Mahesh
45:16
There you go, man. He's listening. They're great. At least somebody listens to my show
45:21
If you're a SQL Server dev, SQL head, and you want to learn how to model for this type of data store
45:28
or migrate an existing app, I definitely recommend it. You can actually, if you don't have Pluralsight
45:34
I think you can just sign up for like a free trial. Yeah, you get a seven-day free trial, I think
45:39
And then they're always doing like free weeks and stuff like that every once in a while
45:46
They seem to be doing more of those during the pandemic, of course. But they just had one like a week ago
45:52
So, you know, we have a two-day workshop, and last year I probably
46:02
did, I don't know, a dozen or more deliveries of that thing. I flew well over 100,000 miles last
46:08
year, Europe and Asia, wherever. I do a bunch of public speaking already, or did
46:13
up until this year. I've got some coming up, though, virtual, but everything's gone
46:18
virtual now. So one of the things I'm looking at doing is taking and porting this workshop
46:22
into an online format. It's just, it's a challenge. Yeah. Customers still need to learn. They still
46:31
need workshops, but you've got to, you know, this is a new world with this pandemic and how you
46:38
how you deliver those things is fundamentally changed. Yes. Yes. So challenging, challenging
46:45
days working from your office from home. Well, I hope you don't lose your airline miles like Jeff
46:52
Fritz is losing his. You know, Jeff flies around too. And he just got a notice a couple of weeks
46:58
ago that American Airlines is not going to keep his miles because no one can fly. Come on, American
47:06
That's not cool. Delta said they're going to give you another year. So I'll keep my diamond status
47:15
for another year. I think if I remember correctly, he said that that runs out with American
47:21
at the end of the year, I think. Right. Yeah. And so right now
47:26
they haven't said they're going to renew it, I guess, but they should. You know what I do is most airline programs
47:33
will allow you to transfer over and keep your status. Oh. So if he's, I don't know what American is
47:40
but whatever the equivalent of diamond is for American, If he's that level, he can quit American and move to Delta
47:48
Of course, all the miles will still be, I think, stuck on there, but you get the new status
47:53
Yeah. Well, my two favorite airlines in the U.S. at least are Delta and American
47:59
So American, get your act together if you're listening. For me, it's tough
48:05
Everything should be extended until a year after the pandemic is over
48:09
Okay? Agreed. Put it out there. Although I'm sure airlines are hurting financially
48:16
Oh, I know, but it doesn't cost them to keep it really. Yeah
48:21
Yeah. And if they want to bring people back, you know, that'd be a better way to bring people back
48:28
Don't alienate them. You want them to come back. Right. So. Yeah. Let's see
48:37
What else can we talk about? We can talk about what it's like being an old software engineer
48:41
I got some. Yeah, let's do that. I think, Mahesh, we've already kind of talked about
48:45
that in the beginning of the show. Mahesh, were you at the beginning of the show
48:49
or did you just drop in? Oh yeah, we talked about that at the beginning. Yeah
48:59
Yeah, another cool thing we can talk about too, and that's also a new concept to folks
49:04
is CAP theorem, particularly if you're not familiar with distributed systems. And
49:12
then, because I focus on the replication, high availability features, consistency models, this
49:18
is something I spend a lot of time thinking about and delivering talks on: understanding CAP
49:23
theorem and PACELC theorem, which are kind of fundamental to understanding the performance
49:30
and availability characteristics for applications that use a distributed data store. Because unlike, say, a relational or a SQL database where you've got a single VM sitting
49:41
in a single data center, distributed databases are in different data centers and distributed by
49:47
hundreds or maybe thousands of miles. And the distance involved there has some real-world
49:55
effects to how your applications work. You need to provide guarantees for how data replicates
50:03
That's what consistency models are used for. And you can have, there's different levels. You can
50:08
have strong consistency at one end of the spectrum. And basically what that means is anytime I write
50:14
data, I have to replicate it and commit it in every replica before I can ack it back, right
50:20
That's great. And basically that's kind of the consistency that people are used to
50:24
in a relational world is strong consistency. Like I want to write data, I want to read it back immediately
50:30
But it's actually very challenging in a distributed database world because the latency involved in doing that can kill your apps
50:40
Yeah. Like if I have two databases in, say, West US and East US
50:46
that latency one way is about 60, average 60 milliseconds, 120 millisecond round trip
50:53
that can, for latency sensitive applications, that can really impact your application
50:59
If I'm, say, going from West US to, say, Singapore, I could be looking at upwards of 600 milliseconds, maybe
51:07
something more. And here's the kicker is you're now taking a dependency
51:13
on a network, on a WAN. WANs are notoriously flaky. They always go down, right
51:20
I mean, just think, if you've tried to do something where you have to have real-time, your real-time communication over a wide area
51:26
network, all bets are off. You don't know what can happen. And believe me, everything can happen
51:33
to a WAN. You know, some guy with a backhoe could dig into the ground and knock out a metro circuit
51:42
into your data center, bring the whole thing down. That exactly happened at a company I worked at
51:48
No, we can't, that exactly happened. We moved all of our servers from San Diego to a co-location in Texas
51:57
I forget the name of the company, but they're a really big one. And even before we moved it, I asked them on the phone, I asked
52:06
do you have multiple T1 lines coming into your building? They said, yes
52:12
And I go, okay, right? Well, they didn't tell us. They were all in the same trench
52:18
Right. And so we come in first thing in the morning, and some guy with a backhoe sliced all the lines into the data center. There you go. We were dead, you know. And so, oh yeah. Even that stuff happens every day
52:37
I mean, I work for the networking team, Azure networking team, and saw this every day
52:41
You know, there's a lot of fiber optics that go through the Suez that connect Europe and Asia
52:48
There's ships that just park out there waiting for their turn to go through, and they drop anchor
52:53
And I can tell you, more than one occasion, somebody in a boat drops an anchor and cuts a fiber optic cable that connects Europe and Asia
53:01
That stuff happens. Yeah. You just don't know. There's no such thing as 100% uptime, right
53:06
There is none, right? And so you've got to deal with flaky networks and change availability and
53:14
also the consistency of your data. So there's a trade-off, which is really what CAP and PACELC theorem
53:18
is all about is you either can have consistent data, but you're going to get it at the expense
53:24
of latency and performance. Or you can give up that consistency and get better performance because you don't need
53:32
to guarantee that data is 100% consistent. And then there's availability too
53:36
So if you lose a replica and you're guaranteeing strong consistency, you cannot take any more writes
53:44
So you could lose, let's say, that network. That backhoe knocks out your circuit to your data center
53:52
If I can't replicate data, I can't take any more writes because I'm supposed to guarantee consistency
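The trade-off described here can be made concrete with a little arithmetic. The sketch below is a simplified model, not how Cosmos DB measures or enforces anything: a strongly consistent write waits for a round trip to the slowest replica and is rejected if any replica is unreachable, while a relaxed write acknowledges after a local commit. The 60 ms one-way figure comes from the discussion; the 300 ms one-way figure for Singapore is an assumption chosen so the round trip matches the 600 ms mentioned, and the 5 ms local commit time is purely illustrative.

```python
from typing import Dict, Optional

# One-way network latency in milliseconds from the write region (West US).
# None marks a replica that is currently unreachable.
ONE_WAY_MS: Dict[str, Optional[int]] = {"East US": 60, "Singapore": 300}


def strong_write_latency_ms(one_way_ms: Dict[str, Optional[int]]) -> Optional[int]:
    """Round trip to the slowest replica, or None if the write must be rejected."""
    latencies = list(one_way_ms.values())
    if any(latency is None for latency in latencies):
        return None  # cannot replicate everywhere, so consistency cannot be guaranteed
    return max(2 * latency for latency in latencies)


def relaxed_write_latency_ms(local_commit_ms: int = 5) -> int:
    """Acknowledge after the local commit; replication happens afterwards."""
    return local_commit_ms


east_only = strong_write_latency_ms({"East US": 60})              # 120 ms round trip
with_singapore = strong_write_latency_ms(ONE_WAY_MS)                # 600 ms round trip
during_outage = strong_write_latency_ms({"East US": None})          # write rejected
relaxed = relaxed_write_latency_ms()                                # local commit only
```

In this model the strongly consistent write gets slower as replicas get farther away and stops entirely during an outage, which is the consistency versus latency and availability trade-off that CAP and PACELC describe.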
53:56
So anyway, that's another thing that's just, when you're building distributed applications
54:03
or distributed applications with a distributed database, this is another consideration you have to take into account
54:11
is this impact on the distance between these, where your replicas are located
54:15
and how your applications perform. Yeah. But it's a good thing to do
54:20
because if your data and your apps are in multiple locations, then you don't have to worry as much
54:30
about if a single data center goes down. Right, exactly. Which happens, by the way
54:35
Yeah, sure it does. So you can get and typically enjoy better availability
54:41
And for customers using Cosmos that are replicated into two or more regions, we give them an extra nine of availability in our SLA
54:50
So you go from four nines to five nines, which, by the way, no other database in Azure has
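To put the extra nine in perspective, the allowed downtime per year can be worked out directly from the availability percentage. This is plain arithmetic on the two figures mentioned, assuming a 365-day year; it says nothing about how any SLA is actually measured.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Maximum downtime per year that still meets the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


four_nines = allowed_downtime_minutes(99.99)    # roughly 52.6 minutes per year
five_nines = allowed_downtime_minutes(99.999)   # roughly 5.3 minutes per year
```

Going from four nines to five nines cuts the allowed downtime by a factor of ten, from a little under an hour a year to a little over five minutes.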
54:56
Wow. So just because of the way we. Yeah. Well, yeah, I think we got to wrap it up now
55:05
OK. We're almost out of time. I can't believe the hour is almost over
55:11
But I really enjoyed our talk. I'm glad we finally got to talk virtually, finally, not through emails
55:19
Or Twitter. Yeah, or Twitter. And, you know, I really – I learned a lot
55:25
And, you know, I definitely want to have you back on, you know, maybe next year, you know, after everything settles down here in America
55:33
Yeah, we'll still be sitting right where we're sitting, I expect. Maybe you can move down here to
55:39
Southern California again and we can hang out more in person. But I have two
55:44
final questions for you real quick before you go, the same questions I ask everybody
55:51
So what do you like to do for fun that's not programming? That's a great question
55:59
Like in your free time? I like to try different bourbons and whiskeys
56:05
No, that's a good one. Yeah. Yeah. I like to go to Costco
56:11
Do they have, you know, up there where you live, do they have like distilleries there that make them there
56:18
You could go try them and stuff. We have some here. They do have some local distilleries around here
56:23
They're not bad. I think generally speaking, most of the good stuff is going to be from the east
56:31
Yeah. And lastly, do you have anything you want to plug before we go
56:39
No, I mean, just, you know, I already plugged stuff like our docs about modeling and partitioning data, and I plugged Lenny's course. You know, we have a site called Got Cosmos
56:54
It's just a little .NET site I have up there. It's just basically a link aggregator with a bunch of stuff
57:00
Check that out, too. Links to stuff. And that's a good site
57:05
to check out. Yeah, I mean, we're on Twitter at @AzureCosmosDB
57:14
I monitor that all the time. I keep an eye on Stack Overflow
57:19
for people that got questions, and I try to answer questions as much as I can
57:23
There's another site, Microsoft Q&A, that just launched a few months ago. People are asking questions
57:29
on there as well. That's monitored by a bunch of folks in Azure and us as well
57:37
Yeah. I mean, go give it a try. We've got a free tier. You can try it out
57:43
400 RU/s, 5 GB of data. We have a serverless tier now, so pure
57:49
consumption. You can go write a serverless Functions app with a serverless database, pure
57:55
consumption mode. Lots of great stuff coming out from us at the end of this year, next year
58:03
So keep an eye on us. The hardest part is the product is changing
58:07
and evolving so rapidly. It's hard to keep up. There's lots of stuff to always update with us
58:18
So keep an eye on us. Yeah, that's kind of my biggest thing about Azure
58:24
is it changes so quick. Everything's out of date like the day after it comes out
58:28
you know, it's the biggest lament, but Hey, why would you get into software if you didn't want to work in an industry
58:34
that moved fast? Right. But, you know, I mean, you and I are getting older
58:40
And so, you know, it's a, it's a, it's harder to learn new tricks
58:45
you know? Well, gotta do it. It's all good. Yep. Well, thank you very much
58:51
And I can't wait to have you back on the show again. And I hope I didn't cause you to drink any more bourbon or whiskey than you
58:56
had to with my questions. Park on water this whole show. No, I mean through our email questions and stuff
59:02
That's what I'm saying. Oh, no, no, no. It's okay. Okay. All right
59:06
Thank you. We'll see you next time. Thanks. Thanks. All right. Can you – yep, there's my slide
59:14
So that was an awesome interview with Mark Brown. I really liked the interview
59:19
And it was great to finally talk to Mark in person and virtually
59:24
And if you want him back, send me an email, and I'll make sure I put him on the list for maybe next year
59:31
Okay. So we have one more giveaway, and this is for the $50 Amazon gift card
59:38
I struggled with this one today. So let's see how you guys like it
59:43
The first one who answers wins a gift card. Cosmos DB is what type of database
59:54
There's like two kind of answers I'll accept. I have already given it to Simon
1:00:00
and Simon will pick the winners. So please type in your answer
1:00:08
if you know what type of database. I don't need a big, long sentence
1:00:12
just a quick, you know, five, 10 words. NoSQL, there we go
1:00:18
Whoever answered NoSQL first wins, Simon, okay? So I would have accepted NoSQL database
1:00:25
or document-based databases, usually the way I talk about it. All right. So and of course, like every week, everybody wins a free copy of CodeRush by DevExpress
1:00:39
If you haven't gotten your copy, please go do it. I don't know why you haven't
1:00:45
CodeRush is the only code refactoring tool that I use. And I recommend it for anybody listening, too
1:00:54
So here's where you get it: DevExpress.com slash dotNetDave. And so please go do that
1:00:59
You can do that anytime you want. Okay? So thanks again for watching another show. And next week I have someone I've known for a long time, actually, and someone I actually really admire for the way he thinks about user experience
1:01:17
And that's Mark Miller, Chief Architect, IDE Tools at Developer Express. So I'm really looking forward to catching up with Mark
1:01:24
I haven't seen Mark in a while, actually. I think the last time I actually saw him in person maybe was the very first build conference
1:01:33
So that'll be great to catch up with Mark. I like Mark
1:01:37
He's a good guy. And please, please, please be careful with the COVID pandemic
1:01:44
America just hit our record high yesterday, almost 100,000 infections in one day
1:01:52
So please, please be careful out there. Things are not looking good for us, and I know they're not looking great for India either. So please be careful out there. And something I'm going to do tomorrow that I recommend all of you do is donate blood at your local blood bank. They really, really need it. I was talking to someone from the blood bank last week, and they are in dire need of your help. So giving blood is easy
1:02:21
It's free. It doesn't cost anything, but a little bit of your time. And I've been doing it ever since
1:02:28
the 1980s and I will never stop as long as I'm able to. So with that, thanks for watching and
1:02:36
please send me your suggestions and emails to dotnetdave at live.com, and I'll see you next week


