GenAI Code Transparency—why it is important to know how much GenAI Code coders are using
Aug 6, 2025
👉 Code Quality Conference 2024 https://codequalityconf.com/ 📺 CSharp TV - Dev Streaming Destination http://csharp.tv 🌎 C# Corner - Community of Software and Data Developers https://www.c-sharpcorner.com/ #CSharpTV #csharpcorner #CodeQualityConf
0:03
Well, I am so honored and delighted to speak with you today on a topic that is very near and dear to me. If I heard it correctly, we're all talking these days about using generative AI, both to advantage ourselves as coders and to help organizations. I'm here today to share a little bit about the organizational view: how major organizations around the world are thinking about the use of generative AI code. In particular, what they're looking to do is understand how much GenAI code is used. That's what I mean when I talk about GenAI code transparency: knowing how much GenAI is used and where in the code, because there are benefits and risks depending on how it's used. I'll go through this presentation, and I'm really looking forward to hearing your questions at the end, but again, please feel free to jump in, and Simon will share them with me.
1:14
me a little bit about me uh some of you
1:14
me a little bit about me uh some of you have at least heard of the programming
1:16
have at least heard of the programming
1:16
have at least heard of the programming language basic which is what I learned a
1:18
language basic which is what I learned a
1:18
language basic which is what I learned a long time ago on a very very very slow
1:21
long time ago on a very very very slow
1:21
long time ago on a very very very slow computer although it was fast at the
1:22
computer although it was fast at the
1:22
computer although it was fast at the time um I've had the great pleasure of
1:25
time um I've had the great pleasure of
1:25
time um I've had the great pleasure of serving
1:26
serving
1:26
serving governments uh uh investors and
1:29
governments uh uh investors and
1:29
governments uh uh investors and operating companies over the last 25
1:31
operating companies over the last 25
1:31
operating companies over the last 25 years on technology and strategy for the
1:35
years on technology and strategy for the
1:35
years on technology and strategy for the last seven years I've led the company I
1:37
last seven years I've led the company I
1:37
last seven years I've led the company I founded SEMA which is built to make Tech
1:41
founded SEMA which is built to make Tech
1:41
founded SEMA which is built to make Tech understandable to non-technical
1:44
understandable to non-technical
1:44
understandable to non-technical audiences we um we saw a real Gap um
1:48
audiences we um we saw a real Gap um
1:48
audiences we um we saw a real Gap um technologists understand the code and
1:50
technologists understand the code and
1:50
technologists understand the code and can talk to each other but when it comes
1:52
can talk to each other but when it comes
1:52
can talk to each other but when it comes to chief executive officers who are not
1:54
to chief executive officers who are not
1:54
to chief executive officers who are not technical or Boards of directors it's
1:56
technical or Boards of directors it's
1:56
technical or Boards of directors it's really important um I'd say for
1:58
really important um I'd say for
1:58
really important um I'd say for individual careers as well as as you as
2:01
individual careers as well as as you as
2:01
individual careers as well as as you as you go up the career ladder to be able
2:03
you go up the career ladder to be able
2:03
you go up the career ladder to be able to explain Tech to to those
2:05
to explain Tech to to those
2:05
to explain Tech to to those non-technical
2:06
non-technical
2:06
non-technical audiences we uh we started working on
2:10
audiences we uh we started working on
2:10
audiences we uh we started working on understanding geni code composition
2:13
understanding geni code composition
2:13
understanding geni code composition about a year ago when two major clients
2:16
about a year ago when two major clients
2:16
about a year ago when two major clients of ours from our first product asked us
2:18
of ours from our first product asked us
2:18
of ours from our first product asked us to start building it we're talking about
2:20
to start building it we're talking about
2:21
to start building it we're talking about product Market fit it's really an honor
2:23
product Market fit it's really an honor
2:23
product Market fit it's really an honor when your current clients ask you to
2:24
when your current clients ask you to
2:24
when your current clients ask you to build something new uh but it also gives
2:27
build something new uh but it also gives
2:27
build something new uh but it also gives us a real a grounding in why this
2:29
us a real a grounding in why this
2:29
us a real a grounding in why this matters to them and what is it that
2:30
matters to them and what is it that
2:30
matters to them and what is it that matters to organizations about gen code
2:34
matters to organizations about gen code
2:34
matters to organizations about gen code transparency and finally a little about
2:36
transparency and finally a little about
2:36
transparency and finally a little about me I have three great kids um we're a
2:39
me I have three great kids um we're a
2:39
me I have three great kids um we're a very arts and craftsy family in the
2:42
very arts and craftsy family in the
2:42
very arts and craftsy family in the bottom right this is a picture um you
2:44
bottom right this is a picture um you
2:44
bottom right this is a picture um you all know what Sushi is uh a tradition
2:47
all know what Sushi is uh a tradition
2:47
all know what Sushi is uh a tradition we've done for more than the last decade
2:49
we've done for more than the last decade
2:49
we've done for more than the last decade we make candy sushi out of marshmallow
2:51
we make candy sushi out of marshmallow
2:51
we make candy sushi out of marshmallow peeps they're really disgusting I would
2:54
peeps they're really disgusting I would
2:54
peeps they're really disgusting I would not recommend eating them but they're
2:55
not recommend eating them but they're
2:55
not recommend eating them but they're they're a lot of fun um when when
2:57
they're a lot of fun um when when
2:57
they're a lot of fun um when when marshmallow peeps are around
3:00
So let's jump in. We're going to talk about three questions today and answer them for you, so hopefully by the end of this you can do a pretty good job explaining the answers to these three questions to anybody, technical or not technical. Question number one: what is GenAI code transparency? We've talked a little bit about it already, but I want to give you some nuance. Question number two: why should I, as an engineer, care about GenAI code transparency? And question number three: how does it work? I bet many of you are already thinking about that, and you've seen discussions of the accuracy, or lack thereof, of detecting GenAI writing versus human writing in natural language. Thankfully it's a little bit easier on the code side, but we're engineers here, so we'll talk about the how.
3:59
Simon, just a quick check, because I can't see you: is this the right pace? Am I still sounding okay? (Simon: "Perfect, and your pace is all right.")
4:09
right perfect well let's let's get in
4:09
right perfect well let's let's get in part one um even before I do that part
4:13
part one um even before I do that part
4:13
part one um even before I do that part zero uh I want to make it incredibly
4:15
zero uh I want to make it incredibly
4:15
zero uh I want to make it incredibly clear how supportive gen uh SEMA is of
4:19
clear how supportive gen uh SEMA is of
4:19
clear how supportive gen uh SEMA is of using gen code uh coding tools whether
4:23
using gen code uh coding tools whether
4:23
using gen code uh coding tools whether it's chat GPT or co-pilot some of them
4:27
it's chat GPT or co-pilot some of them
4:27
it's chat GPT or co-pilot some of them are some tiers of those tools are are
4:29
are some tiers of those tools are are
4:29
are some tiers of those tools are are better in certain circumstances uh if
4:31
better in certain circumstances uh if
4:31
better in certain circumstances uh if you're using it in a corporate setting
4:33
you're using it in a corporate setting
4:33
you're using it in a corporate setting you really want to make sure you're
4:34
you really want to make sure you're
4:34
you really want to make sure you're using an Enterprise grade level of it uh
4:38
using an Enterprise grade level of it uh
4:38
using an Enterprise grade level of it uh but basically we think there are two uh
4:41
but basically we think there are two uh
4:42
but basically we think there are two uh two categories of reasons why we like
4:44
two categories of reasons why we like
4:44
two categories of reasons why we like gen code usage one is it's good for
4:47
gen code usage one is it's good for
4:47
gen code usage one is it's good for developers the second is it's good for
4:50
developers the second is it's good for
4:50
developers the second is it's good for overall Excuse Me overall engineering
4:52
overall Excuse Me overall engineering
4:53
overall Excuse Me overall engineering productivity so for that the
4:54
productivity so for that the
4:54
productivity so for that the organization benefits as well and I know
4:57
organization benefits as well and I know
4:57
organization benefits as well and I know almost all of you have tried it if not I
4:59
almost all of you have tried it if not I
4:59
almost all of you have tried it if not I really encourage you immediately to
5:01
really encourage you immediately to
5:01
really encourage you immediately to start playing with it because it is the
5:02
start playing with it because it is the
5:02
start playing with it because it is the future of coding um and not not just for
5:06
future of coding um and not not just for
5:06
future of coding um and not not just for that sake but as folks who use it they
5:09
that sake but as folks who use it they
5:09
that sake but as folks who use it they they will tell you it um helps avoid
5:12
they will tell you it um helps avoid
5:12
they will tell you it um helps avoid mundane work it helps experiment faster
5:15
mundane work it helps experiment faster
5:15
mundane work it helps experiment faster it enables you to have a you know a
5:17
it enables you to have a you know a
5:17
it enables you to have a you know a realtime code reviewer um who's not
5:20
realtime code reviewer um who's not
5:20
realtime code reviewer um who's not always going to be right but at least
5:22
always going to be right but at least
5:22
always going to be right but at least there with you a as you are coding so
5:24
there with you a as you are coding so
5:24
there with you a as you are coding so the the overall um overall satisfaction
5:28
the the overall um overall satisfaction
5:28
the the overall um overall satisfaction is that it's much higher for coders
5:30
is that it's much higher for coders
5:30
is that it's much higher for coders using gen um than without it also
5:34
using gen um than without it also
5:34
using gen um than without it also substantially increases uh engineering
5:37
substantially increases uh engineering
5:37
substantially increases uh engineering productivity um the we estimate there's
5:41
productivity um the we estimate there's
5:41
productivity um the we estimate there's probably a 40 times return um on the co
5:45
probably a 40 times return um on the co
5:45
probably a 40 times return um on the co relative to the cost of buying a license
5:48
relative to the cost of buying a license
5:48
relative to the cost of buying a license to um the benefits for the organization
5:51
to um the benefits for the organization
5:51
to um the benefits for the organization so if you're working in an organization
5:53
so if you're working in an organization
5:53
so if you're working in an organization and they have not bought you an
5:54
and they have not bought you an
5:54
and they have not bought you an Enterprise license uh please send them
5:57
Enterprise license uh please send them
5:57
Enterprise license uh please send them this presentation because it is
6:00
this presentation because it is
6:00
this presentation because it is really good for the business and you
6:02
really good for the business and you
6:02
really good for the business and you really want the business should want um
6:04
really want the business should want um
6:04
really want the business should want um you to be using a really highgrade tool
6:07
you to be using a really highgrade tool
6:07
you to be using a really highgrade tool maybe even more than one tool some
6:08
maybe even more than one tool some
6:08
maybe even more than one tool some companies um buy licenses for for more
6:11
companies um buy licenses for for more
6:11
companies um buy licenses for for more than one product so the developers can
6:12
than one product so the developers can
6:12
than one product so the developers can use them in different circumstances so
6:15
use them in different circumstances so
6:15
use them in different circumstances so we are big fans of gen coding tools and
6:18
we are big fans of gen coding tools and
6:18
we are big fans of gen coding tools and I want to be very clear about
6:20
I want to be very clear about
6:20
I want to be very clear about that we have become um you know experts
6:24
that we have become um you know experts
6:24
that we have become um you know experts in this newer field of AI code
6:28
in this newer field of AI code
6:28
in this newer field of AI code transparency so what do we mean by that
6:30
transparency so what do we mean by that
6:30
transparency so what do we mean by that there's really two parts number one it's
6:33
there's really two parts number one it's
6:33
there's really two parts number one it's distinguishing code that originated with
6:36
distinguishing code that originated with
6:36
distinguishing code that originated with Gen tools with they could be generic
6:39
Gen tools with they could be generic
6:39
Gen tools with they could be generic like chat GPT or specific to coding like
6:41
like chat GPT or specific to coding like
6:41
like chat GPT or specific to coding like get up co-pilot versus originated not
6:44
get up co-pilot versus originated not
6:45
get up co-pilot versus originated not gen so everything else that didn't come
6:47
gen so everything else that didn't come
6:47
gen so everything else that didn't come as a result of a prompt is not gen uh it
6:49
as a result of a prompt is not gen uh it
6:49
as a result of a prompt is not gen uh it could be written by the team it could be
6:51
could be written by the team it could be
6:52
could be written by the team it could be open source code uh it could be
6:54
open source code uh it could be
6:54
open source code uh it could be autogenerated code that came from um you
6:58
autogenerated code that came from um you
6:58
autogenerated code that came from um you know a scaffolding uh or other framework
7:01
know a scaffolding uh or other framework
7:01
know a scaffolding uh or other framework code that wasn't human written but it
7:04
code that wasn't human written but it
7:04
code that wasn't human written but it didn't come out of a prompt and and the
7:06
didn't come out of a prompt and and the
7:06
didn't come out of a prompt and and the real question that is on people's minds
7:08
real question that is on people's minds
7:08
real question that is on people's minds is did it come out of a prompt or not
7:09
is did it come out of a prompt or not
7:10
is did it come out of a prompt or not which is why it's gen or not gen I'll
7:13
which is why it's gen or not gen I'll
7:13
which is why it's gen or not gen I'll come back and talk about autogenerated
7:14
come back and talk about autogenerated
7:14
come back and talk about autogenerated at the end because that's that's one of
7:16
at the end because that's that's one of
7:16
at the end because that's that's one of the things that makes it hard to detect
7:18
the things that makes it hard to detect
7:18
the things that makes it hard to detect um uh detect
7:20
um uh detect
7:20
um uh detect easily second part is once we've decided
7:23
easily second part is once we've decided
7:23
easily second part is once we've decided it's gen we want to distinguish between
7:27
it's gen we want to distinguish between
7:27
it's gen we want to distinguish between gen that was uh unmodified that came
7:31
gen that was uh unmodified that came
7:31
gen that was uh unmodified that came straight from the prompt we call that
7:33
straight from the prompt we call that
7:33
straight from the prompt we call that pure gen or did it get Modified by
7:36
pure gen or did it get Modified by
7:36
pure gen or did it get Modified by developers after the fact and we call
7:39
developers after the fact and we call
7:39
developers after the fact and we call that Blended and as we're going to see
7:42
that Blended and as we're going to see
7:42
that Blended and as we're going to see through this
7:43
through this
7:43
through this presentation uh Blended code uh is much
7:47
presentation uh Blended code uh is much
7:47
presentation uh Blended code uh is much safer than uh than pure gen code I'm
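The two-part definition above (GenAI or not GenAI, then pure or blended) can be sketched as a small classifier. This is only an illustration of the taxonomy, not how any real detection tool works: the `from_prompt` and `modified_by_dev` flags are hypothetical inputs that assume the provenance of each line is already known, which the talk treats as the hard part.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    NOT_GENAI = "not GenAI"    # team-written, open source, or scaffolded code
    PURE_GENAI = "pure GenAI"  # came out of a prompt and was left unmodified
    BLENDED = "blended"        # came out of a prompt, then edited by developers


@dataclass(frozen=True)
class CodeLine:
    text: str
    from_prompt: bool      # hypothetical flag: did the line come out of a prompt?
    modified_by_dev: bool  # hypothetical flag: did a developer edit it afterwards?


def classify(line: CodeLine) -> Origin:
    """Apply the two questions in order: prompt or not, then modified or not."""
    if not line.from_prompt:
        return Origin.NOT_GENAI
    return Origin.BLENDED if line.modified_by_dev else Origin.PURE_GENAI


def composition(lines: list[CodeLine]) -> dict[Origin, float]:
    """Share of lines in each category, as fractions of the total."""
    if not lines:
        return {origin: 0.0 for origin in Origin}
    counts = Counter(classify(line) for line in lines)
    return {origin: counts[origin] / len(lines) for origin in Origin}
```

Calling `composition` on the lines of one application, or one part of an application, gives the kind of "how much GenAI, and how much of it is pure" breakdown the talk describes.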
7:52
safer than uh than pure gen code I'm
7:52
safer than uh than pure gen code I'm going to pause for a second You'
7:55
going to pause for a second You'
7:55
going to pause for a second You' probably all almost all of you have
7:57
probably all almost all of you have
7:57
probably all almost all of you have tried generative AI coding tools I hope
8:00
tried generative AI coding tools I hope
8:00
tried generative AI coding tools I hope you have maybe some of you are skeptical
8:03
you have maybe some of you are skeptical
8:03
you have maybe some of you are skeptical and maybe some of you are skeptical
8:06
and maybe some of you are skeptical
8:06
and maybe some of you are skeptical because you're wondering if it if you
8:08
because you're wondering if it if you
8:08
because you're wondering if it if you really are a real developer if you're
8:10
really are a real developer if you're
8:10
really are a real developer if you're using a tool like this we've certainly
8:13
using a tool like this we've certainly
8:13
using a tool like this we've certainly heard that as we've interviewed uh
8:16
heard that as we've interviewed uh
8:16
heard that as we've interviewed uh coders around the world H I don't know
8:18
coders around the world H I don't know
8:18
coders around the world H I don't know am I cheating if I do this or not and
8:21
am I cheating if I do this or not and
8:21
am I cheating if I do this or not and when people think that there's there's
8:23
when people think that there's there's
8:23
when people think that there's there's nothing nothing wrong with that and
8:24
nothing nothing wrong with that and
8:24
nothing nothing wrong with that and Frank Frankly it's to to wonder about
8:26
Frank Frankly it's to to wonder about
8:26
Frank Frankly it's to to wonder about that um uh and we applaud you thinking
8:29
that um uh and we applaud you thinking
8:29
that um uh and we applaud you thinking about what does it mean to really
8:31
about what does it mean to really
8:31
about what does it mean to really respect your craft I have a very simple
8:34
respect your craft I have a very simple
8:34
respect your craft I have a very simple answer replace the phrase gen code with
8:38
answer replace the phrase gen code with
8:38
answer replace the phrase gen code with open source and see what your answer is
8:42
open source and see what your answer is
8:42
open source and see what your answer is so let's go back to this does using open
8:45
so let's go back to this does using open
8:45
so let's go back to this does using open source code uh increase developer job
8:50
source code uh increase developer job
8:50
source code uh increase developer job satisfaction I'm sure you would say yes
8:52
satisfaction I'm sure you would say yes
8:53
satisfaction I'm sure you would say yes the idea of rebuilding Frameworks or
8:55
the idea of rebuilding Frameworks or
8:55
the idea of rebuilding Frameworks or rebuilding modules or libraries that
8:57
rebuilding modules or libraries that
8:57
rebuilding modules or libraries that already exist when you know there's a
8:59
already exist when you know there's a
8:59
already exist when you know there's a there's a simpler answer it's much more
9:01
there's a simpler answer it's much more
9:01
there's a simpler answer it's much more satisfying uh to uh to use an open
9:04
satisfying uh to uh to use an open
9:04
satisfying uh to uh to use an open source answer
9:05
source answer
9:05
source answer instead it also if you if you think
9:07
instead it also if you if you think
9:08
instead it also if you if you think about for a moment open source makes
9:10
about for a moment open source makes
9:10
about for a moment open source makes developers a lot more productive uh in
9:13
developers a lot more productive uh in
9:13
developers a lot more productive uh in that you don't need to build the
9:16
that you don't need to build the
9:16
that you don't need to build the frontend uh framework you can just have
9:18
frontend uh framework you can just have
9:18
frontend uh framework you can just have one and then make add your customization
9:20
one and then make add your customization
9:20
one and then make add your customization to it so you're already we know that
9:23
to it so you're already we know that
9:23
to it so you're already we know that open source is um is almost universally
9:26
open source is um is almost universally
9:26
open source is um is almost universally adopted you're already using a tool that
9:28
adopted you're already using a tool that
9:28
adopted you're already using a tool that makes it easier for you to code and
9:30
makes it easier for you to code and
9:30
makes it easier for you to code and focus on the parts that only you can do
9:32
focus on the parts that only you can do
9:32
focus on the parts that only you can do that only you can do and there's such a
9:34
that only you can do and there's such a
9:34
that only you can do and there's such a value in that uh that that's true for
9:37
value in that uh that that's true for
9:37
value in that uh that that's true for open source and it's true for Gen code
9:40
open source and it's true for Gen code
9:40
open source and it's true for Gen code as well so if anyone says I don't know
9:42
as well so if anyone says I don't know
9:42
as well so if anyone says I don't know if it makes me real if I'm not sure I'm
9:44
if it makes me real if I'm not sure I'm
9:44
if it makes me real if I'm not sure I'm I'm a real developer because I use it as
9:46
I'm a real developer because I use it as
9:46
I'm a real developer because I use it as to use open source it's really the same
9:48
to use open source it's really the same
9:48
to use open source it's really the same thing there is however one really big
9:51
thing there is however one really big
9:51
thing there is however one really big difference which is why I wanted to
9:52
difference which is why I wanted to
9:52
difference which is why I wanted to bring this back up between open source
9:54
bring this back up between open source
9:54
bring this back up between open source and gen code as you hopefully know if
9:57
and gen code as you hopefully know if
9:57
and gen code as you hopefully know if not um a quick Refresher for open source
10:00
not um a quick Refresher for open source
10:00
not um a quick Refresher for open source code to be the safest what you want to
10:03
code to be the safest what you want to
10:03
code to be the safest what you want to do in all situations or almost all
10:05
do in all situations or almost all
10:05
do in all situations or almost all situations is not modify it you want to
10:10
situations is not modify it you want to
10:10
situations is not modify it you want to reference the most recent version you
10:12
reference the most recent version you
10:12
reference the most recent version you want to keep up to date with the most
10:14
want to keep up to date with the most
10:14
want to keep up to date with the most recent version one reason for that is uh
10:18
recent version one reason for that is uh
10:18
recent version one reason for that is uh the code maintainability another reason
10:20
the code maintainability another reason
10:20
the code maintainability another reason is the security uh open source codes has
10:24
is the security uh open source codes has
10:24
is the security uh open source codes has security uh vulnerabilities uh that are
10:26
security uh vulnerabilities uh that are
10:26
security uh vulnerabilities uh that are publicly known called cve and staying up
10:29
publicly known called cve and staying up
10:29
publicly known called cve and staying up to to date with the most recent version
10:30
to to date with the most recent version
10:30
to to date with the most recent version helps you achieve those um um helps you
10:35
helps you achieve those um um helps you
10:35
helps you achieve those um um helps you uh um uh fix uh you know automatically
10:38
uh um uh fix uh you know automatically
10:39
uh um uh fix uh you know automatically fixes uh the errors certainly the vast
10:40
fixes uh the errors certainly the vast
10:40
fixes uh the errors certainly the vast majority of them the security errors uh
10:43
majority of them the security errors uh
10:43
majority of them the security errors uh and so for open source code you do not
10:45
and so for open source code you do not
10:45
and so for open source code you do not want to blend it you want it to stay
10:48
want to blend it you want it to stay
10:48
want to blend it you want it to stay pure gen code is the opposite uh for
10:51
pure gen code is the opposite uh for
10:51
pure gen code is the opposite uh for reasons we're going to get to and so you
10:53
reasons we're going to get to and so you
10:53
reasons we're going to get to and so you want to think about gen code similar to
10:55
want to think about gen code similar to
10:55
want to think about gen code similar to open source with the exception that you
10:57
open source with the exception that you
10:57
open source with the exception that you really want to make it yours you want to
10:59
really want to make it yours you want to
10:59
really want to make it yours you want to blend it because it becomes safer for a
11:01
blend it because it becomes safer for a
11:01
blend it because it becomes safer for a variety of
11:03
variety of
11:03
variety of reasons Okay so we've talked about what
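The contrast above (keep open source unmodified and current, but blend GenAI code) can be written down as a simple rule of thumb. The sketch below is an illustration only: `review_flags`, its parameters, and the wording of the reminders are hypothetical names introduced here, not part of any Sema product or of the talk itself.

```python
from enum import Enum


class Source(Enum):
    OPEN_SOURCE = "open source"
    GENAI = "GenAI"


def review_flags(source: Source, modified: bool, on_latest_version: bool = True) -> list[str]:
    """Return reminders for one piece of code, following the talk's rule of thumb."""
    flags = []
    if source is Source.OPEN_SOURCE:
        # Open source is safest when left unmodified and kept up to date.
        if modified:
            flags.append("open source was modified: harder to maintain and to pick up upstream fixes")
        if not on_latest_version:
            flags.append("open source is behind the latest version: published CVE fixes may be missing")
    elif source is Source.GENAI:
        # GenAI code is the opposite: it should be reviewed and adapted by a developer.
        if not modified:
            flags.append("pure GenAI: have a developer review and adapt it so it becomes blended")
    return flags
```

For example, `review_flags(Source.GENAI, modified=False)` returns one reminder to blend the code, while unmodified, up-to-date open source returns an empty list.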
11:06
Okay, so we've talked about what GenAI code transparency is. I'll give you a really simple example from our own company. We have two products. The newest product is AI Code Monitor, which detects the amount of GenAI code in a code base.
11:26
So this is the application. Scoop is our first product, which has been around for seven years; a code scanning tool is the right way to think about it. We also group some of our code separately: it's still part of Scoop, but we haven't modified it in the last three months. So this is looking at sort of an application level. This is an application, and these are two parts of the application. How much GenAI is there?
11:56
If I use this number in a sentence: from the AI Code Monitor, 18% of the code originated with GenAI. That means 100 minus 18 is 82, so 82% did not originate with GenAI. And then of that 18%, at least 8% is blended, meaning it was modified at least somewhat. We know that as a minimum. Up to 10% is unmodified.
12:26
For Scoop, our older product, and I should say this is all code, all time, there's much less GenAI code in it: 98% of it was generated without any GenAI.
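The arithmetic in this example can be written out as a small sketch. The figures are the ones quoted in the talk; the class and field names are illustrative assumptions and are not part of the AI Code Monitor itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Composition:
    """GenAI share of a code base, as percentages of all code (hypothetical type)."""

    genai_pct: float        # code that originated with GenAI
    blended_min_pct: float  # GenAI code known to have been modified (a lower bound)

    @property
    def non_genai_pct(self) -> float:
        # Everything that did not originate with GenAI.
        return 100.0 - self.genai_pct

    @property
    def unmodified_max_pct(self) -> float:
        # GenAI code not known to be blended may still be unmodified (an upper bound).
        return self.genai_pct - self.blended_min_pct


# Figures quoted for the AI Code Monitor code base.
ai_code_monitor = Composition(genai_pct=18.0, blended_min_pct=8.0)

# For Scoop the talk only states that 98% was written without GenAI.
scoop_genai_pct = 100.0 - 98.0
```

Reading `ai_code_monitor.non_genai_pct` and `ai_code_monitor.unmodified_max_pct` gives back the 82% and "up to 10%" figures from the example.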
12:43
Now, you might already be coming to a conclusion about whether one of these is better or worse. It's okay to be naturally curious. At Sema we build code metrics for a living and we love asking questions, but we really try to be curious before we're judgmental, because code is a craft. It's so contextual to know what the right answer is in anyone's circumstance, and there may be extenuating circumstances before you can come to a conclusion. So we like to ask questions rather than judge.
13:12
A question here could be: is there a good reason to have a lot more GenAI code in the AI Code Monitor than in Scoop, our first product? If I could do a poll, I'd see if any of you can think of a compelling reason, but since I can't, I'll just tell you: there is definitely a good reason. The AI Code Monitor has been 100% created in the last year, and in the last year GenAI coding tools have been available.
13:45
By contrast, Scoop is a very stable product. Almost all of it was built before GenAI tools came around, so most of it didn't have an opportunity to have GenAI in it. There's always a story about data, whether it's GenAI code or not, so we lead with inquiry first, and I really encourage you to do that as you think about data about code.
14:19
So we've covered "what is GenAI code transparency?", our first question. Second question: why should I care? I'm going to say to you, you may or may not care yet, but of the organizations you're a part of, the organizations you might be working for today or in the future, many already care. And I know you'll remember this conversation: in the next six to 24 months, everyone will care about knowing how much GenAI is in the code base. So consider yourself on the cutting edge by being here for this presentation.
14:53
I'm going to start this section with a news story that came out last Friday. Axios is a major tech publication in the United States with a global reach. "When AI-produced code goes bad": the same generative AI tools can produce flawed, potentially dangerous code. It matters, of course, because generative AI code is touching so much.
15:21
So it's beginning to seep into the consciousness that while it's an incredibly good idea to be using GenAI for coding, it also comes with some risks. This article, in fact, if you read it, points out two of them. There are really five categories of risk, and all five are reasons to know the composition, because if you know how much GenAI code is used and where it's used, you're more likely to be able to address it.
15:52
The five are these. First, the quality of the code: GenAI code may be lower quality, for example because it's hallucinating or not given appropriate context. Second, GenAI code comes with security bugs. By the way, all code comes with security bugs; there's nothing against that. You just need to take precautions to find them and address them.
16:12
Third is not using it, and this is again from the organizational perspective. We would say that if developers aren't using GenAI enough, they're missing out on opportunities for greater job satisfaction and for organizational productivity. Again, imagine if developers weren't using open source, and how much less fun their work would be. So measuring it helps make sure that we're getting to the right level of developer productivity.
16:41
The fourth risk is that some uses of generative AI at work, code or otherwise, create some intellectual property risk. Like I said earlier, if you work for a business, it is a must-have for that business to give you an enterprise-grade license. Almost all of the GenAI products have one, but you need it to protect the organization's code, because if you're not using an enterprise-grade tier, your code can be used as part of training data, and so you're basically giving away the organization's secrets.
17:15
So please, if your company hasn't approved a license, send them to me and I will explain why it's just a must-have for the safety of the business. We don't care which tool; it just has to be a tool with the right protections.
17:34
The last risk of the five is something we call exit risk. Exit risk refers to generative AI code being a risk when someone's trying to sell the business or take investment in the business. The process of looking at the code during a sale or potential investment is called technical due diligence, and leading investors around the world are already looking at how much GenAI code is in the code base. Under certain circumstances, it could be harder to sell the business.
18:09
That is a very, very high-stakes decision for the owners and investors of a company. As GenAI code composition becomes part of the due diligence process, it becomes even more important to understand along the way what your composition is.
18:33
So those are the five categories of reasons. Again, you're now among the experts on Earth, because this is such a new field. Even a fancy publication like Axios only thought about two of the five; you now know all five reasons.
18:51
Remember at the beginning, when we talked about the big difference between open source and GenAI code: for open source you do not want to blend; for GenAI you do. I'm going to say this briefly, because I bet getting into questions is more fun, but basically every risk is reduced by developers taking the code coming out of the prompt and customizing it: modifying it for the circumstances, making sure it's correct, putting it through security gates. I think we were just talking about this, quality gates, in the previous session.
19:24
We want you to use it. If it's up to us, we want you to use it, but we want you to make it yours and really make sure it is correct, secure, contextual.
19:34
Lastly, just a little bit and then we'll go to questions, because I know it's always more fun to have the discussion. Some people may wonder how on Earth you detect GenAI code. Just in case the segue wasn't clear, I talked a little bit about detecting the use of GenAI in human language. Think about plagiarism detectors: plagiarism refers to folks passing off work as their own, and there are now plagiarism detectors that ask, was this paper for school written by a GenAI tool or was it written by a human?
20:15
Another way of saying that is GenAI detection for human languages. What Sema is doing, and what folks in the space are doing, is GenAI detection for computer languages.
20:31
When you say plagiarism, you're looking for it because you don't want it there, and so the goal in academic settings is to not use GenAI where it's not approved. In this context there are a few circumstances where GenAI should not be used, and so the detection is about not using it. It's all based on company policy: sometimes companies say, we just don't want this part of the code to be using GenAI. But in the vast majority of situations it is appropriate to use it, and how much it is used and how much it is blended are the really important parts.
21:11
really important parts in order to detect gen code um
21:14
parts in order to detect gen code um
21:14
parts in order to detect gen code um this so just a little bit of our of our
21:16
this so just a little bit of our of our
21:16
this so just a little bit of our of our secret sauce uh we have an AI model that
21:19
secret sauce uh we have an AI model that
21:19
secret sauce uh we have an AI model that is a deep learning model that is trained
21:22
is a deep learning model that is trained
21:22
is a deep learning model that is trained on gen code and not gen code to create
21:26
on gen code and not gen code to create
21:26
on gen code and not gen code to create predictions about whether the next the
21:29
predictions about whether the next the
21:29
predictions about whether the next the next token is going to be gen gen or not
21:33
next token is going to be gen gen or not
21:33
next token is going to be gen gen or not we then add to it a blending calculator
21:36
we then add to it a blending calculator
21:36
we then add to it a blending calculator to decide if certain code was modified
21:39
to decide if certain code was modified
21:39
to decide if certain code was modified by developers or not and again I'll say
21:42
by developers or not and again I'll say
21:42
by developers or not and again I'll say just a little bit about this in order
21:44
just a little bit about this in order
21:44
just a little bit about this in order to uh detect gen code to use a detection
21:48
to uh detect gen code to use a detection
21:48
to uh detect gen code to use a detection engine you can't just feed the whole
21:49
engine you can't just feed the whole
21:49
engine you can't just feed the whole thing in as a you know as an entirety
21:52
thing in as a you know as an entirety
21:52
thing in as a you know as an entirety and say is there gen in it or not I
21:54
and say is there gen in it or not I
21:54
and say is there gen in it or not I guess you could but it's not interesting
21:56
guess you could but it's not interesting
21:56
guess you could but it's not interesting to say you know if you think about 100
21:58
to say you know if you think about 100
21:58
to say you know if you think about 100 ,000 line
21:59
,000 line
21:59
,000 line codebase yes no was there gen in it
22:02
codebase yes no was there gen in it
22:02
codebase yes no was there gen in it that's not so useful you want it to be
22:04
that's not so useful you want it to be
22:04
that's not so useful you want it to be as a as a granular unit as possible
22:07
as a as a granular unit as possible
22:07
as a as a granular unit as possible while still being able to have an
22:08
while still being able to have an
22:08
while still being able to have an accurate detection and so that process
22:11
accurate detection and so that process
22:11
accurate detection and so that process is called chunking there's an art and
22:12
is called chunking there's an art and
22:13
is called chunking there's an art and science to getting the code that's being
22:15
science to getting the code that's being
22:15
science to getting the code that's being evaluated just the right size that's
22:16
evaluated just the right size that's
22:16
evaluated just the right size that's what chunkin uh chunking is
22:19
what chunkin uh chunking is
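A minimal sketch of the chunking idea follows, assuming fixed-size chunks of non-blank lines. The talk does not say how chunks are actually sized, so `chunk_lines`, `genai_share`, the `max_lines` default, and the `is_genai` callback are all hypothetical stand-ins; the real detector is a deep learning model that is not reproduced here.

```python
from typing import Callable, Iterator, List


def chunk_lines(lines: List[str], max_lines: int = 40) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `max_lines` non-blank lines."""
    current: List[str] = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines; they carry no signal for a detector
        current.append(line)
        if len(current) == max_lines:
            yield current
            current = []
    if current:
        yield current  # final, possibly shorter, chunk


def genai_share(
    lines: List[str],
    is_genai: Callable[[List[str]], bool],
    max_lines: int = 40,
) -> float:
    """Percentage of non-blank lines that sit in chunks flagged by `is_genai`."""
    total = 0
    flagged = 0
    for chunk in chunk_lines(lines, max_lines):
        total += len(chunk)
        if is_genai(chunk):
            flagged += len(chunk)
    return 100.0 * flagged / total if total else 0.0
```

Scoring per chunk rather than per code base is what turns a yes-or-no answer into a percentage. Smaller chunks give finer granularity but leave the detector less context, which is the trade-off the talk describes.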
22:19
what chunkin uh chunking is about the way that we train the model uh
22:22
about the way that we train the model uh
22:22
about the way that we train the model uh is um uh you know you have gen code and
22:26
is um uh you know you have gen code and
22:26
is um uh you know you have gen code and not gen code gen code code we
22:29
not gen code gen code code we
22:29
not gen code gen code code we synthetically generated by feeding um uh
22:34
synthetically generated by feeding um uh
22:34
synthetically generated by feeding um uh instructions into a gen tool and asking
22:36
instructions into a gen tool and asking
22:36
instructions into a gen tool and asking it for code doing that at scale
22:38
it for code doing that at scale
22:38
it for code doing that at scale obviously and not gen code we picked uh
22:41
obviously and not gen code we picked uh
22:41
obviously and not gen code we picked uh examples that came from before gen tools
22:44
examples that came from before gen tools
22:44
examples that came from before gen tools were uh in common place so using older
22:47
were uh in common place so using older
22:47
were uh in common place so using older slightly older
22:48
slightly older
22:48
slightly older code last thing I'll say about blending
22:51
code last thing I'll say about blending
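The training setup described above can be sketched as a labeled dataset with two sources. This is only an illustration under stated assumptions: `build_training_set`, the `generate` callable (a stand-in for whatever GenAI tool is used), and the `CUTOFF` date are hypothetical and do not come from the talk, which gives neither a tool nor a date.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable


@dataclass
class Sample:
    code: str
    label: int  # 1 = GenAI code, 0 = not-GenAI code


# Hypothetical cutoff: code written before this date is assumed to predate
# common use of GenAI coding tools.
CUTOFF = date(2021, 1, 1)


def build_training_set(
    prompts: Iterable[str],
    generate: Callable[[str], str],
    human_code: Iterable[tuple[str, date]],
    cutoff: date = CUTOFF,
) -> list[Sample]:
    """Label synthetic outputs as 1 and sufficiently old code as 0."""
    samples = [Sample(generate(prompt), 1) for prompt in prompts]
    samples += [Sample(code, 0) for code, written in human_code
                if written < cutoff]
    return samples
```

Code written after the cutoff is simply dropped from the negative class, since it can no longer be assumed to be free of GenAI.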
22:51
Last thing I'll say about blending is we have a very simplistic model today, where we at least understand if there's been more than one commit. What's going to be coming in the years ahead is more and more sophisticated ways of measuring it, but we wanted to start somewhere.
23:08
Two things and then I will stop. Three things, and I will stop. If you decide that you care as an organization about GenAI code composition, we think there's two more things. It's not enough to just look at it at a dashboard view; you have to make it easy, you have to make it more useful. One way to make it more useful is for developers to have GenAI code detection when they're doing code reviews. Almost all of you are familiar with seeing quality gates or security gates during the code review process. It's just a magical way for code to get better and coders to get better, and so GenAI code detection should also be at that stage.
23:47
And then the other: because there's all these high-stakes decisions, like "am I going to pass diligence or not," there's this notion of attestation, which means humans going through and either agreeing with or overriding whatever the label was about GenAI, to come up with a final, permanent record of how much GenAI is in the code. So you'll be hearing a lot about attestation. You'll probably have to do it with some of your companies in the years ahead.
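Attestation as described here, a human agreeing with or overriding the detector's label to produce a final record, could be modeled roughly as follows. This is a sketch under assumptions, not the product's data model: `ChunkLabel`, `attest`, `genai_share`, and the choice to weight by line count are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Iterable


@dataclass(frozen=True)
class ChunkLabel:
    chunk_id: str
    line_count: int
    detected_genai: bool                # what the detector said
    attested_genai: bool | None = None  # human decision, None until reviewed
    attested_by: str | None = None

    @property
    def final_genai(self) -> bool:
        """The human decision wins when present; otherwise the detector's."""
        if self.attested_genai is None:
            return self.detected_genai
        return self.attested_genai


def attest(label: ChunkLabel, reviewer: str, is_genai: bool) -> ChunkLabel:
    """Return a new record carrying the reviewer's agreement or override."""
    return replace(label, attested_genai=is_genai, attested_by=reviewer)


def genai_share(labels: Iterable[ChunkLabel]) -> float:
    """Fraction of lines whose final label is GenAI (0.0 if there are none)."""
    labels = list(labels)
    total = sum(label.line_count for label in labels)
    if total == 0:
        return 0.0
    genai = sum(label.line_count for label in labels if label.final_genai)
    return genai / total
```

Keeping the detector's label alongside the human decision, rather than overwriting it, means the record still shows where the reviewer disagreed with the tool.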
24:19
Before I stop, there are several ways to learn more if you find this topic interesting. We have a whole bunch of blog posts. Simon, I'm sure you're going to post this as a PDF, so you can find the blog posts, see software.com. If you are an engineering manager or a CTO, engineering manager and up, we have AI advisory councils. Feel free to drop us a note and say if you're interested. Free, of course. And if you'd like to try out the AI code monitor that we saw today, if you use GitHub or Azure DevOps, we're offering two-week trials.
24:49
we're offering twoe trials [Music]
#Programming
#Software