What is inside a package, and how to verify and avoid malicious packages

Presenter: Kevin Griffin of SwiftKick Training, and an Inedo-certified Master Trainer

Note: The following text is a transcript of the video, with minor edits for readability.

A package is just a container that has other files inside of it.

The fundamental thing a package needs is to contain everything that is required to use it. Whether that’s an executable, a series of libraries, or something else, it all needs to be inside the package. Some packages might need a script to help configure them or configure the system to use them. I have installed packages that run PowerShell scripts, or might have to run a batch script of some sort, and that’s just to set a couple of environment variables. It doesn’t have to be complicated, but if stuff is inside the package, the package installer can run it automatically and do the work that is necessary to use it.

You might have supporting content sometimes, like images and stylesheets. If you’re a developer, a lot of times it’s just configuration files or sample configuration files.

None of this is useful unless there is a manifest that tells the package installer what’s in the package and what to do with it.
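To make the container-plus-manifest idea concrete, here is a minimal Python sketch. It is not any real packaging format: the file name manifest.json, the package name, and the listed paths are all made up for illustration. It uses a zip archive as the container, which is the same kind of container a NuGet .nupkg file uses.

```python
import io
import json
import zipfile

# Hypothetical manifest: every name and path below is invented for this sketch.
example_manifest = {
    "name": "example-tool",
    "version": "1.0.0",
    "files": ["bin/tool.txt", "config/sample.cfg"],
    "post_install": "scripts/setup.txt",
}


def build_package(manifest: dict) -> bytes:
    """Write the manifest and placeholder files into an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("manifest.json", json.dumps(manifest))
        for path in manifest["files"] + [manifest["post_install"]]:
            archive.writestr(path, f"placeholder for {path}")
    return buffer.getvalue()


def read_manifest(package_bytes: bytes) -> dict:
    """Open the archive and load the manifest an installer would act on."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return json.loads(archive.read("manifest.json"))


def missing_files(package_bytes: bytes) -> list:
    """List files the manifest promises that are not actually in the archive."""
    manifest = read_manifest(package_bytes)
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        present = set(archive.namelist())
    return [path for path in manifest["files"] if path not in present]


package_bytes = build_package(example_manifest)
loaded_manifest = read_manifest(package_bytes)
print(loaded_manifest["name"], loaded_manifest["version"])
print("missing:", missing_files(package_bytes))
```

The point of the sketch is only that the installer reads the manifest first and then uses it to decide what to do with everything else in the container.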

This will also really depend on what packaging system you’re using. NuGet does it differently than NPM, and NPM does it differently than, say, pip. We could spend a lot of time going down all the different rabbit holes.

Actually, if you need to know more about a specific packaging system, you should really let us know, because we have training set up for all these different packaging types.

So the manifest is important.

It’s the thing that tells the installer, “here’s what we got, here’s what we need to do, please go do it.”

Examples

Here are two examples of manifests. One is with NuGet, using what’s called a NuSpec (that’s just a fancy word for manifest), and the other is NPM, which uses what’s called the package.json file to define all its metadata.

Here’s a simple example with NuGet or “NuSpec”

The manifest just needs to tell the packaging system, the feed, what it needs to know about your package and it doesn’t have to be complicated.

Here we’re just telling it what our unique identifier is, Newtonsoft.Json. That’s what a person would type into the feed search if they were looking for this package specifically.

Next is the current version of the package. We’ll talk a little more about versioning in a different video, but if I put a version in here in SemVer format, the feed can read it. Then it can automatically classify whether this package is newer or older than another package.
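As a rough sketch of how a feed could decide "newer or older" from SemVer strings, the Python below compares the numeric MAJOR.MINOR.PATCH parts. The function names are hypothetical, and the sketch deliberately ignores pre-release and build-metadata ordering, which the full SemVer rules also cover.

```python
def parse_version(text: str) -> tuple:
    """Split 'MAJOR.MINOR.PATCH' into integers, dropping any -prerelease or +build suffix."""
    core = text.split("+", 1)[0].split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return (major, minor, patch)


def is_newer(candidate: str, installed: str) -> bool:
    """True when candidate has a higher numeric version than installed."""
    return parse_version(candidate) > parse_version(installed)


# Comparing the raw strings would get this wrong, because "1.10.0" sorts
# before "1.9.3" alphabetically; comparing the parsed numbers does not.
print(is_newer("1.10.0", "1.9.3"))
print("1.10.0" > "1.9.3")
```

Parsing into numbers is the whole trick: it is what lets a feed sort versions correctly instead of alphabetically.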

We also have titles and authors (who’s the owner of the package). This is useful metadata if you’re searching a feed and you want to know who built the package, or who’s responsible for the package development.

NuGet has a special setting for requiring acceptance of license agreements, and there are also links for licenses and projects. There’s an icon URL, so if you want to have a special icon in the feed, you can put all that in there too.

We could keep going down the list, but there’s a series of tags in here that help identify what makes this package Newtonsoft.Json.
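For readers without the slide in front of them, here is a small Python sketch that reads the metadata fields described above out of a NuSpec-style manifest. The element names (id, version, title, authors, requireLicenseAcceptance, tags) are standard NuSpec elements, but every value other than the package id is a placeholder, and the XML namespace that real .nuspec files usually declare is left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Placeholder manifest: only the id comes from the talk; the other values are invented.
NUSPEC_TEXT = """<package>
  <metadata>
    <id>Newtonsoft.Json</id>
    <version>1.0.0</version>
    <title>Example title</title>
    <authors>Example Author</authors>
    <requireLicenseAcceptance>false</requireLicenseAcceptance>
    <tags>json serialization</tags>
  </metadata>
</package>"""


def read_metadata(nuspec_text: str) -> dict:
    """Return each child of <metadata> as a tag -> text mapping."""
    root = ET.fromstring(nuspec_text)
    metadata = root.find("metadata")
    return {child.tag: (child.text or "").strip() for child in metadata}


fields = read_metadata(NUSPEC_TEXT)
for name, value in fields.items():
    print(f"{name}: {value}")
```

A feed does essentially this kind of lookup to populate search results: the id is what people search for, and the rest is descriptive metadata.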

Here’s an example of the NuSpec file’s dependencies

Depending on where I’m installing a NuGet package, it might need a couple different dependencies.

What’s smart about how NuGet does this installation is that it knows where it is installing the package, so it can look for the proper set of dependencies.
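The idea of picking a dependency set based on the install target can be sketched in Python as follows. The group and dependency elements with targetFramework, id, and version attributes follow the NuSpec layout, but the package name Example.Library is hypothetical. The sketch also only does an exact match on the framework name, whereas NuGet itself works out the nearest compatible framework.

```python
import xml.etree.ElementTree as ET

# Placeholder dependency section; the dependency id and version are invented.
DEPENDENCIES_TEXT = """<dependencies>
  <group targetFramework="net45" />
  <group targetFramework="netstandard2.0">
    <dependency id="Example.Library" version="1.0.0" />
  </group>
</dependencies>"""


def dependencies_for(xml_text: str, target_framework: str) -> list:
    """Return (id, version) pairs from the group whose targetFramework matches exactly."""
    root = ET.fromstring(xml_text)
    for group in root.findall("group"):
        if group.get("targetFramework") == target_framework:
            return [
                (dep.get("id"), dep.get("version"))
                for dep in group.findall("dependency")
            ]
    return []  # no matching group in this simplified sketch


print(dependencies_for(DEPENDENCIES_TEXT, "netstandard2.0"))
print(dependencies_for(DEPENDENCIES_TEXT, "net45"))
```

The same package can therefore pull in nothing extra on one target and an additional library on another, without the person installing it having to think about it.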

NPM is famous for having complicated manifests

It’s not so much that they’re complicated; it’s that they’re wordy, and you think you need one thing, but you really need three dozen things.

In this case, I’ve had to break up the dependency list for a simple application that we run in production. The application doesn’t really do much; it’s just a basic UI, but we have a plethora of dependencies that come along with this project.

NPM breaks dependencies up into two different types: dependencies and development dependencies (I’m not going to go down that rabbit hole). Either way, the manifest says, “here’s everything that you need to know about this project.”

It has information in there for name and for version, and we can also designate scripts to say, “ok, when we’re running a production build we need to run this script, if we’re running a development build, then that script, if we just need to clean everything up, here’s a third script to run.”
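Here is a hedged Python sketch of what reading such a manifest looks like. The field names (name, version, scripts, dependencies, devDependencies) are the standard package.json keys, but the package names, version ranges, and script commands are all invented placeholders, and the real application from the talk has far more dependencies than this.

```python
import json

# Placeholder package.json content; every name and command here is made up.
PACKAGE_JSON_TEXT = """
{
  "name": "example-ui",
  "version": "1.0.0",
  "scripts": {
    "build": "echo production build",
    "dev": "echo development build",
    "clean": "echo clean up"
  },
  "dependencies": {"example-lib": "^1.2.0"},
  "devDependencies": {"example-test-tool": "^3.0.0"}
}
"""

package = json.loads(PACKAGE_JSON_TEXT)

print(package["name"], package["version"])
print("runtime dependencies:", sorted(package.get("dependencies", {})))
print("development dependencies:", sorted(package.get("devDependencies", {})))
for script_name, command in package.get("scripts", {}).items():
    print(f"script {script_name!r} runs: {command}")
```

Structurally this is the same information a NuSpec carries (identity, version, dependencies), just expressed as JSON with named scripts alongside it.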

So, the big difference between NuGet and NPM is just how they’re structured, but fundamentally they’re doing the exact same thing.

Package Verification

If you’re installing packages from what you should consider untrusted sources, and that includes the official public repositories that are available out there on the internet, you should take those packages with a grain of salt.

You don’t necessarily know if the package you’re downloading is what you think it is. There have been several cases over the past couple of years where people have been able to push up packages that contain malformed code or that try to do something malicious. The feed developers have worked continuously to try to prevent this, and verification is one of those steps.

The real solution, if you’re worried about this problem, is to run your own internal feed management system. You need a tool like ProGet that eliminates the guesswork, because you’re in full control of what packages are available to everyone in your organization.

Going back to package verification, most package managers use hashes of the file contents to tell one version from another. Here’s an example using NPM where there’s a SHA-1 hash of the file contents, and when you install that package, you basically get a thumbprint at installation time.
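A minimal Python sketch of that thumbprint check, using the standard hashlib module. The byte strings stand in for a downloaded package file, and the function names are hypothetical; a real check would read the package file from disk and take the expected hash from the feed or the lock file.

```python
import hashlib
import hmac


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 thumbprint of the given bytes as a hex string."""
    return hashlib.sha1(data).hexdigest()


def matches_expected(data: bytes, expected_hex: str) -> bool:
    """Compare the computed thumbprint with the one we were told to expect."""
    return hmac.compare_digest(sha1_hex(data), expected_hex.lower())


# Stand-ins for the package contents: one as published, one altered afterwards.
published_bytes = b"pretend these are the package contents"
tampered_bytes = b"pretend these are the package contents!"

expected = sha1_hex(published_bytes)  # what the feed would advertise

print(matches_expected(published_bytes, expected))  # same bytes, same thumbprint
print(matches_expected(tampered_bytes, expected))   # one extra byte changes the hash
```

Even a one-byte change produces a different hash, which is why a thumbprint recorded at installation time is enough to notice that the contents are no longer the ones you installed.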

You need to check your currently installed version of the package against the version in the feed, and against what your manifest says you should have.

If all three of those match, you can trust that you’re running the known good version of the package, and if something were to change in any of those locations, you would know. Again, this isn’t a complete failsafe, but it’s a step towards making sure that you’re installing the packages that you think you’re installing.
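The three-way comparison can be sketched in Python like this. The function name and the short hash strings are placeholders; the only assumption is that you can obtain a hash for the installed copy, the copy in the feed, and the value recorded in your manifest or lock file.

```python
def find_mismatches(installed: str, feed: str, manifest: str) -> list:
    """Compare the three hashes pairwise and describe every pair that disagrees."""
    hashes = {"installed": installed, "feed": feed, "manifest": manifest}
    names = list(hashes)
    mismatches = []
    for index, first in enumerate(names):
        for second in names[index + 1:]:
            if hashes[first].lower() != hashes[second].lower():
                mismatches.append(f"{first} != {second}")
    return mismatches


# Placeholder hash values, shortened for readability.
all_good = find_mismatches("abc123", "abc123", "abc123")
manifest_differs = find_mismatches("abc123", "abc123", "ffff00")

print(all_good or "all three agree")
print(manifest_differs)
```

An empty result means all three locations agree. Any entry in the list tells you which two locations disagree, which is the signal to stop and investigate before trusting the package.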

