#140 Background Checking Your Open Source

on Wed May 24 2023 17:00:00 GMT-0700 (Pacific Daylight Time)

with Darren W Pulsipher and Michael Mehlberg

In this episode, Darren interviews Michael Mehlberg about increasing confidence in open source through background checking the open source communities.

If you’re a software developer, you know the feeling of pride that comes with creating a popular package or tool that many people find useful. However, this popularity can sometimes attract the attention of attackers who look for vulnerabilities to exploit.

In this episode, Michael Mehlberg points to the Log4j vulnerability as an example. With millions of instances of the package in the wild, fixing it quickly was critical to keeping attackers from exploiting it.

This scenario highlights the importance of vigilance for software developers, especially those who create popular packages or tools. While it may be tempting to bask in the glory of a widely-used product, it’s crucial to remember that popularity can also attract attackers. Regular checks and updates to address any vulnerabilities can help protect users and prevent exploitation.

As a software developer, it’s important to approach your work with both pride and caution. While it’s great to contribute to society with your creations, don’t forget to prioritize the security and safety of your users. Stay vigilant and keep your packages up-to-date to prevent vulnerabilities from being exploited by attackers.

Podcast Transcript

Hello, this is Darren Pulsipher, chief solution architect of public sector at Intel.

And welcome to Embracing Digital Transformation, where we investigate effective change, leveraging people, process, and technology.

On today's episode, Background Checking Your Open Source, with Michael Mehlberg, CEO of Dark Sky Technology.

Michael, welcome to the show.

Thank you for having me.

Hey, Michael, when we first talked, I was really enamored with what you guys do, which we'll get to in a second.

But first off, I know my audience wants to know a little bit more about you and where you come from and what you're doing now.

Yeah. Happy to share.

I, I do.

I've been in technology for as long as I can remember.

Started when I was about 12 years old.

I had a friend that was working on a Compaq computer, you know, Compaq with a Q, if you remember.

I remember that. Oh, yeah.

It's a funny name, because this thing was huge and had a handle on the back, and it probably weighed 50 pounds.

But you could carry it around.

It was compact, right?

And I remember that.

Exactly right.

And he had programmed this game.

I think he called it Mortal Kombat, with two stick figures that were fighting each other.

And I was just floored that you could do that.

You know, I never thought about where programs came from.

And so I started programming.

You know, he helped me learn, and I knew right then I was going to get into software development and making video games all through high school, and then finally ended up at Purdue University studying computer science, and left there to go right into industry, where I was working on basically DoD software, protecting, you know, weapons systems against tampering and reverse engineering.

So I got a crash course on how to reverse engineer software at the first company I did an internship with.

And I love solving problems, and that turned out to be at least good enough to keep my job there. And from there, just, you know, got into the whole industry of protecting software, protecting weapon systems.

And I have been, you know, learning ever since, and I'm still learning.

Cybersecurity is one of those things where it's kind of a never-ending job.

There's always some new attack out there and some new way to defend against it.

Well, it does actually keep us busy. Right?

It keeps us employed. Right.

So, yeah, I guess we do.

We attribute we like the hackers out therecausing us to have no problem.

We know there's plenty of other problems to solve without them creating any more.

Right. Yeah.

Yeah, very much so. So.

So tell me a little bit about, when I first talked to you I went, wow, I never thought of this about the whole open source.

There's a big push right now on securing the software supply chain out there.

Open source is a big aspect of that.

Tell me your guys's approach to helping secure the software supply chain.

Yeah, I guess.

I guess it kind of helps to see what I've seen over the years.

When I first got started, we were really just focused on protecting an operating system, or even a single application, if it had some data in it or some algorithm that was particularly sensitive. You know, we wanted to protect it, because if an attacker got hands on it, then, you know, they would understand it, or they would be able to reveal the secrets that were in the software.

And so we started with just applications, then started protecting bigger systems, right?

Operating systems and things of that nature.

And over time, the open source development community just started exploding.

And so while we were learning, you know, how to both break into and defend systems, there were just thousands and thousands of packages being put out there by the whole open source community, which was phenomenal, because the first thing you do as a developer is go looking for, you know, somebody who's already solved the problem.

Yeah.

So you don't have to reinvent the wheel.

So eventually, you know, it came to a point where, all right, there's a lot of operating systems out there.

There's constantly problems within operating systems.

How do you really make a secure one?

Right?

You can patch all this stuff on after the fact.

But what if you missed something, right?

It's a tough game, a cat-and-mouse game that we play with the attacker.

And unfortunately, it's in the attacker's favor most of the time, because we have to catch every single bug and every vulnerability in order to protect the system wholly, which is an impossible task most of the time.

Whereas an attacker only has to find one.

They only have to find one. Yeah. Yeah.

And then they're in.

So how do you, you know, how do you really make a secure system from the ground up?

Well, everybody's using open source.

You know, a couple of years ago, you know, Linux is huge.

It's in just about every system that's out there.

But how do you secure it from the ground up?

And we kind of came to this realization that, boy, if we can't trust the developers who are developing the software, and that's kind of the foundation of it, right, that we trust our developers, then how do we actually trust the code that we would want to use from all of these different open source packages?

And so that's kind of how we, I guess, came to where we are today.

First, starting on embedded system security, just focused on applications, then kind of broadening our view to how do we secure the whole operating system.

And now really looking at kind of fundamental trust issues.

And when I first started, software supply chain, I don't think, was uttered once in any conversation that we had, and only recently, right, have people really started talking about it.

And I think it's just because there truly is a software supply chain now.

There are, you know, just an enormous number of packages and developers out there that are all contributing, both from within organizations and from outside them, creating this supply chain of people who develop systems, in some cases unknowingly.

Right. They're just developing a package elsewhere in the world.

They have no idea that it's being used in really critical systems.

Well, that brings up something interesting. You said you have to trust your developers, right?

So typically, I know when I got hired on at my first job, oh, my goodness, the vetting that they did for me was outrageous.

I had a psychological profile done.

I had background checks.

I had security checks done, all this stuff, so that the company knew who they were bringing in and writing software.

And specifically for me, it was for medical imaging, which was my first job out of college.

So they were ultra cautious.

But we just go and download some open source package off the Internet.

Right? Right.

We don't know who wrote that stuff.

We have no idea if they have a bent towards doing something malicious or nefarious.

Sure.

I mean, we go to great lengths to trust the people that we bring into our organizations.

And then, you know, I have a developer background.

What's the first thing you do? You go and try and find somebody who's already done it.

And they are probably not inside your organization.

Probably not yet.

Probably haven't been checked or looked at.

And, you know, 99-point-whatever percent of those people are probably developing open source for the right reasons.

They're putting their code out there. They're sharing it.

That's the amazing aspect of it.

But there's also kind of, you know, the few that can ruin it for the rest of us.

And you just don't know, you know, what you're getting, because you haven't put anybody through that type of.

So is that why you guys kind of shifted your focus from protecting embedded systems to operating systems to getting down to the core, which is, can I trust the person actually writing the code?

Is that how you got to that?

Yeah, it really is.

Because, I mean, still the ultimate focus is getting a system out there that is secure and, you know, accomplishes whatever mission it needs to accomplish.

But as we're building it, you know, because we're pulling in software packages from all of these different places, you know, left, right, up and down, from all over the world, boy, if we can't trust those, and, you know, our assertion is that you can't, because you don't know who has worked on it, you've got to at least look at it and make sure that you're not pulling in, you know, a problem, either an unintentional or a maliciously inserted problem, into, you know, what are really, really critical systems to either, you know, national security or economics or whatever it may be.

Well, I mean, when people argue, well, I test the software, right?

I've tested the software, and it passes all of the tests that I run to make sure there are no vulnerabilities or anything like that.

Or maybe you do a code review and go line by line on open source code.

I don't know very many people that do. No.

I don't.

You know, we talk about that a lot, and in so many cases you just can't.

I mean, the Linux kernel alone has almost, it's maybe exceeded, 30 million lines of code.

That's amazing.

And if you have an enterprise distribution, you know, something like Red Hat, you're talking about a hundred million lines of code or more.

It's just untenable to review all of that, you know, with human beings at least.

So you apply tools to it to try and see if you can catch vulnerabilities.

And those tools, absolutely, they still have to remain in our pipelines.

So we're not getting rid of those tools.

We still need those tools.

Absolutely. Yeah.

You know, we can identify so many past vulnerabilities.

We don't want to see them again.

Right. As a developer, you copy and paste code.

You may unintentionally copy a vulnerability that's already been detected in one package, and you're using it over here in another. A software assurance tool is able to identify that and root it out.

So then why do you even care about who's contributing it?

If these tools out there are looking at things, what danger is there in having a developer that's malicious if I'm scanning their code anyway?

Yeah.

So the trouble is, I don't know the latest numbers off the top of my head, but there's some dozens of new vulnerabilities that are discovered every single day, right?

And so that tells me, and, you know, tells kind of the cybersecurity community of experts, that there's something that we're not catching.

There's something that's in there, that's getting in there, that we're not able to identify up front.

Because if we could identify it up front, we would root those out and you'd see, you know,

There's also, you know, some more nefarious things that could happen, you know, switches that get flipped at the behest of some attacker, where you don't know if there's any kind of malicious code in there until that switch is flipped, until they've.

Oh, like a runtime switch, even.

Yeah, something like that.

I think those are probably going to be few and far between.

But these are the creative things that you can do as an attacker to try and inject malicious code.

I think, you know, the real point is, if you don't know where that code is coming from, which you don't if you're just grabbing it from anywhere, then you don't really know what you're pulling in.

And given that there are so many new vulnerabilities every day, given that there are, you know, millions of developers out there we just don't know anything about, and some very small percentage of them that can kind of upset the applecart for the rest of us,

it's worth taking a look to make sure that, you know, the package that you pull in doesn't have something that was maliciously injected.

So you said something interesting.

There's a bunch of us that depend on a small number of packages that could affect lots of things.

Log4j is a great example, right? When the Log4j vulnerability came out, there was another package, and I can't remember the name off the top of my head, in the Node.js community that almost everyone depended on, either directly or indirectly.

Oh, sure.

That was maintained by one person.

Right.

There's quite a few of those actually.

Yeah. So.

Whoa. Okay.

You know, when you think, and it popped up because this one person was like, everyone's making money off of me.

Where's my take?

That happens a lot.

You know, I even used to think of open source,

You think of open source as a community of people that are all, you know, all eyes on this one package, and you're going to have security experts in there, as well as programming experts and memory experts.

And they're all going to point out, when you put this package out there in the open, where the problems are.

Hey, you're using memory wrong over here.

Hey, this is the vulnerability that we've already seen before.

Don't program it that way.

At the end of the day, a huge, huge percentage of packages are really being developed and maintained by just one or, in some cases, a handful of people.

There's big packages for sure, like OpenSSL, a crypto package being developed by hundreds of developers, or the Open Java Development Kit, which I think has some 600-plus active developers, but most packages, and when I say most, I mean a large double-digit percentage of packages, are maintained by one.

I wanted to be.

Is that something that, well, obviously we were concerned about?

Is that something that you can easily identify?

Yeah. Most of the time that information is available, though: who is actively contributing to it.

You just go on GitHub most of the time, right?

Or GitLab, or wherever they are.

That's there and available, and even who is inactive, meaning they've contributed in the past and they are no longer making contributions.

And so all of those can be kind of interesting pieces of data. You can see a timeline of history.

You can start to understand how supported or unsupported a particular package is, and that may be a risk for your program.

If you have a really critical program that you're working on and you have, you know, this one piece of software, no matter how small, that is just a critical piece of that, you know, Jenga stack of blocks that you have in your entire software stack, you don't want that thing falling out from under you and causing the rest of it to topple.

So it's worth knowing how supported it is.

It's worth knowing a little bit about where it comes from and who's worked on it.
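
As a rough illustration of the point about active and inactive contributors, the following Python sketch classifies contributors from a list of commit records. The record format, the 365-day cutoff, and the function name are assumptions made for this example, not anything described in the conversation. In practice the commit history would come from wherever the package is hosted.

```python
from datetime import date, timedelta

def contributor_activity(commits, today, inactive_after_days=365):
    """Split contributors into active and inactive sets.

    commits: iterable of (author, commit_date) tuples (hypothetical format).
    A contributor counts as inactive when their most recent commit is
    older than the cutoff (an assumed threshold, tune per program).
    """
    last_seen = {}
    for author, when in commits:
        # Keep only the most recent commit date per author.
        if author not in last_seen or when > last_seen[author]:
            last_seen[author] = when
    cutoff = today - timedelta(days=inactive_after_days)
    active = {a for a, d in last_seen.items() if d >= cutoff}
    inactive = set(last_seen) - active
    return active, inactive

# Made-up sample data for illustration only.
sample = [
    ("alice", date(2023, 4, 1)),
    ("alice", date(2021, 1, 15)),
    ("bob", date(2020, 6, 30)),
]
active, inactive = contributor_activity(sample, today=date(2023, 5, 24))
print("active:", sorted(active), "inactive:", sorted(inactive))
```

A package with one active contributor and many inactive ones is the kind of support risk discussed here. What counts as acceptable is a judgment for each program.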

You know, on who's worked on it, too.

Because I know there are developers out there, contributors to open source, that work on similar packages at the lower levels or framework levels.

And these are very prolific programmers that work across several different packages at the same time.

Sure. Right. They're contributing.

I think it would be fascinating to take a software package and break it down and see the number of open source contributors you have, and then see who your biggest contributor is across the full software stack, by number of packages or lines of code, whatever.

I think that would be fascinating to look at, because you could easily see which individual you're most dependent on, right?

Right. That's right.

Yeah.
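
The idea of finding the individual a software stack depends on most can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the mapping of package names to contributor names is hypothetical sample data, and counting packages per person is just one of the measures mentioned (lines of code would be another).

```python
from collections import defaultdict

def packages_per_contributor(stack):
    """Map each contributor to the set of packages they have contributed to.

    stack: dict of package name -> iterable of contributor names
    (a hypothetical input format for this sketch).
    """
    touched = defaultdict(set)
    for package, contributors in stack.items():
        for person in contributors:
            touched[person].add(package)
    return touched

def most_depended_on(stack):
    """Return (contributor, package_count) for whoever appears in the most packages."""
    touched = packages_per_contributor(stack)
    # Ties are broken alphabetically so the result is deterministic.
    person = min(touched, key=lambda p: (-len(touched[p]), p))
    return person, len(touched[person])

# Made-up package and contributor names for illustration only.
sample_stack = {
    "logging-lib": ["carol", "dave"],
    "http-client": ["carol"],
    "json-parser": ["erin", "carol"],
}
print(most_depended_on(sample_stack))
```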

So when you say prolific, we have visualizations that we've created by looking at this in different ways that show some people are making, you know, tens of thousands or in some cases hundreds of thousands of contributions.

It could be something as small as, you know, a change to a character or line of code, or something as big as a check-in of a function.

But tens of thousands of changes, additions, or removals to open source packages, they really, truly are prolific.

And it's fascinating to see, you know, how the whole community kind of comes together, and then certain people who are, you know, kind of the whales for a particular package are really influencing things.

Wow, that's neat.

Now, here's another question.

What motivates, because there's some motivation behind a developer.

I'm a software developer too, Michael, so, you know, and I've contributed to open source.

What's the motivation for someone to spend that much time and effort?

Well, I mean, I can speak from experience: when I get a problem, you know, a technical problem that I can solve with programming,

I just can't stop until I solve it.

And I think that's part of it.

There's definitely a tremendous feeling of satisfaction that you get from open source and contributing back to the community, by seeing a lot of people use the package that you've created.

You're right.

When I've written a package or something and the number goes above 10,000, you're like, yes, look how cool, right?

It's like I'm contributing to society, right?

People are downloading my stuff.

It's awesome. That's great.

There's, you know, the unfortunate side of that.

I think that very same set of motivations can also contribute to the attacker side of things as well, where, oh, look at this Log4j, for example, which you mentioned earlier: millions and millions of instances of that package out there in the wild.

I bet they were head over heels happy about how widely spread that particular vulnerability was.

Oh, yeah.

And so fixing it, you know, was one of the most important things that we did quickly, because of how widespread that was.

Yeah, that's that's crazy.

So tell me a little bit about what Dark Sky Technology is doing in this space, because we kind of touched a little bit on it.

So if I wanted your guys's services, what sort of help could you give me to see my open source vulnerabilities?

I'm going to call it, maybe it's not vulnerabilities, maybe it's exposure.

Yeah, right.

Yeah.

A risk or a measurement of trust.

Measurement of trust.

I like that even better. Yeah.

You know, that's maybe best illustrated with an example.

There was a package last year.

You can look it up.

There was a Reuters article that came out about this package called Pushwoosh, which is kind of a sophisticated notification package. If you wanted to, you know, put notifications in an application that you were developing and you didn't want to do that yourself, you know, you could integrate Pushwoosh in there.

Well, the Army and the CDC had integrated Pushwoosh into multiple applications.

I think the CDC said seven or eight that they had put it in, not sure how many for the Army, but they had put this in there and they were using it.

And by the looks of it, Pushwoosh was an American company, and they were, you know, American run, headquartered, I think, in California, with offices in Maryland and somewhere else in the United States.

Come to find out they were actually not headquartered in America.

They were headquartered in Moscow.

They're paying taxes in Moscow.

They had developers in Siberia and some in Thailand as well.

And so you don't even have to know if there is a single vulnerability in that package to immediately know that you want to remove it from your systems and software.

And that's what the Army Corps and the CDC did: they removed it.

And that was what led to that Reuters article.

Holy smokes. Right.

This potentially Russian influence in our applications that we had to get rid of because.

Well, that's really interesting, because I could see the open source community could be easily infiltrated by nation-state bad actors.

Sure.

You know.

And we all know some of the best programmers in the world are coming out of Russia, Estonia.

They've got great programmers in that area.

So I would guess that they are contributing to open source.

Yeah, I think it's safe to say that any avenue that a, you know, foreign state actor could use to infiltrate or get some sort of meaningful advantage over a technology, they're going to use and leverage, and you have to assume that, right?

That's the game that you're playing.

So Pushwoosh.

You know, the first thing that we did when we saw that article is we plugged it into our own tools, because a lot of the tools that we've developed are aimed at automatically detecting trust issues like that.

And so we plugged it in, the tools went out, did their analysis, they looked at different sources of intelligence that are out there, open sources of intelligence, and made a determination that, holy smokes, you know, there is an enormous effort inside of Russia to develop this particular Pushwoosh package.

And there is no development out of the United States.

And so that's a red flag.

You know, right away: you're telling me you're an American company, but you don't have a single piece of development happening out of there.

So it was able to detect sort of trust issues like that

and raise the warning flag and say, hey, I'm not so sure about this.

Open source is developed all over the world.

It doesn't mean there's anything wrong with that.

Like you said, there's plenty of very good Russian developers, or any nationality for that matter, who are contributing to open source.

But when you say you're one thing and you're actually another, that's potentially an issue.
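
The red flag described here, a declared home country that barely shows up in the actual development activity, can be illustrated with a small Python sketch. How the tools in question really work is not described in the conversation, so everything below is an assumption: the function name, the 5% threshold, and the idea that a country code has already been inferred for each commit.

```python
from collections import Counter

def location_mismatch(declared_country, commit_countries, min_share=0.05):
    """Flag a package whose declared home country barely appears in its commits.

    declared_country: the country the vendor claims (hypothetical input).
    commit_countries: one country code per commit, however that was inferred.
    min_share: assumed threshold below which the claim looks inconsistent.
    Returns (flagged, share_by_country).
    """
    counts = Counter(commit_countries)
    total = sum(counts.values())
    if total == 0:
        # No evidence either way, so nothing to flag.
        return False, {}
    share = {country: n / total for country, n in counts.items()}
    flagged = share.get(declared_country, 0.0) < min_share
    return flagged, share

# Made-up data: the vendor claims "US", but no commits originate there.
flagged, share = location_mismatch("US", ["RU"] * 9 + ["TH"])
print(flagged, share)
```

The flag only says the claim and the evidence disagree. As noted in the conversation, development happening in any particular country is not in itself the problem.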

So what other trust issues do you guys evaluate?

I mean, location, of course. Location is going to be built into the Internet trust, whether we like it or not.

Right.

The federal Department of Defense is not going to use open source that was developed in China or Russia right now, or North Korea.

Sure. They're just not, right?

So that's one trust issue.

What others? Do you score these?

How do you evaluate a contributor?

It's really quite difficult.

And we, in fact, do not want to be in the business of scoring, because every program is going to be a little bit different.

Okay. And there's going to be some absolute nos.

Right.

If you have a developer from a sanctioned company or country, that's a no right away.

The rest of it, though, can be quite a bit, you know, gray, so to say. You've got some programs that are going to say absolutely no to this country, you know, having any development influence, or absolutely no to this company having any sort of development influence.

And you've got other programs that, you know, they don't care.

They're not as sensitive.

And so it really is up to the program or the business unit or the company itself to determine what their business and security requirements are.

I think the government is coming around, as you've probably seen recently with the executive order and the follow-up memorandum about software bills of materials. There is really a push for software supply chain security.

And in that memorandum they even mention having a risk framework, and it wasn't particularly well defined what that risk framework is. I think we're going to get to a nice definition here, you know, in the coming months or year.

Once that definition happens, we might be able to say, all right, there are some quantitative things that we can do here to alert ourselves if we run into these issues: this country is a no, you know, somebody who's contributed multiple CVEs is a no.

Yeah, I was going to ask that.

I mean, what attributes can you report on? Not necessarily scoring, but you are reporting on attributes, like this programmer contributed and it resulted in a CVE, and that happens a lot.

Like if you ran it on me, it might say, you know, Darren is a 25%, you know, CVE generator, right?

Generally, that would be bad, right?

I don't want to use any of his code anymore because he writes crappy code.

Right. Right, right.

Yeah.

Can you get down to that level, where I can look at, hey, you know, the correlation between Darren checking code into open source and CVEs coming up on his code, right?

I mean, you can get to the level of, I don't know if there's anybody out there doing it, but multiple CVEs would certainly be a flag. Again, could it be intentional or not intentional?

Right.

Like I said, you copy-paste code all the time.

If there's a piece of code you're using over and over.

I charge you. But he wrote it for me.

That's what I'm going to apply.

Yeah, that's right.

It wasn't my fault.

It was my fault. You.

He did it.

What about going beyond just programming, right?

What about looking in?

Because they do this. They.

They do this all the time with employees.

They.

They look at financial records.

They look at, right?

Mm hmm. Yeah.

You know, I don't know if companies are still doing drug testing or not.

Obviously, you can't do a virtual drug test.

Sure.

But you can look at criminal records.

You can look at a whole bunch of things.

Is that something that you see as valuable?

Is that something you guys can do as well, reach into the public side of things as well?

I mean.

Yeah, if you think about, you know, just the Internet profile on yourself, or if I think about the Internet profile on myself: I have a LinkedIn profile, I'm on Reddit, I'm on GitHub, and you start to look at all the different pieces of information that are available.

What skills do I have?

You know, I have some work history in the field of cryptography, but I probably shouldn't be developing any crypto algorithms.

I definitely shouldn't be developing any kernel-level, you know, Linux drivers, for example.

So I might be able to look at that and say, well, this person is kind of out of their element.

And the quality of the code that they write is pretty low, and they're associated with this malicious website over here.

And you start to build up and say, I want to look at what they're writing, right?

I've got to make sure. I can't review 30 million lines of Linux kernel code, but I can look at their, you know, dozen lines that they've contributed and, you know, just make sure that everything looks okay and say, yeah, thumbs up.

We have a little flag here, but we're going to swipe that away because it looks like it's okay.

We've had eyes on it.

So you're not just looking at my contributions.

You're also doing a background check on open source developers.

You could say that, for the purposes of finding those developers who would intend to create malicious harm.

Well, yeah, some of the companies do that.

Companies do that when they hire someone, right?

You have to.

Yeah, yeah, yeah, yeah. And right.

I've had background checks done on me, and rightly so.

And if the first thing that I did was step in the door of a company that now trusts me because of that background check and the references that they've talked to, and I start pulling in code from all over the place, and they don't know where that came from,

I wasn't the one that wrote it.

Somebody else? Yeah.

They have no idea where it's coming from.

Now, that's somewhat scary.

So yeah, if people want to find out more about this whole concept, I love it, trusting open source contributors, right? Where do they find out more information, and where can they contact you to engage if this is a concern of theirs?

Yeah, darkskytechnology.com is our home page.

We spend a lot of time on LinkedIn.

That's a great way to direct message us.

Search Dark Sky Technology and you'll find us, and you can send us a message or interact with our content.

We're really big on, you know, kind of spreading the word and just opening up the things that we've learned about cybersecurity and reverse engineering over the past 20 years, sharing it with the community and, you know, trying to help bring awareness and understanding that open source is a phenomenally great thing that's just done amazing things in the technology space.

But we shouldn't just, you know, blindly grab it and put it into our systems, especially for those really critical systems that are either, you know, supporting our warfighter or maybe responsible for, you know.

Health care, financial.

I mean, yeah, the list is long, right?

Right. So, well.

Yeah that would be a great placeis. Yeah. Yeah.

Michael, thank you for coming on the show.

This is this is exciting.

This is interesting to me right there.

A very interesting topic.

It's very interesting.

It's so much fun to be engaged, you know, kind of at the root level of trust.

Right.

Because if we can build that trust from the ground up, then we can finally get to systems that we trust.

And, you know, we can send them out knowing that they're going to do what they're meant to do and not be compromised by somebody who has different intentions for us.

Yeah, absolutely.

Again, Michael, thanks for coming on the show.

Thanks for having me.

Thank you for listening to Embracing Digital Transformation today.

If you enjoyed our podcast, give it five stars on your favorite podcasting site or YouTube channel.

You can find out more information about Embracing Digital Transformation at embracingdigital.org.

Until next time.

Go out and do something wonderful.