Hey there DevOps wizards, code gurus, and IT maestros! Ever found yourself buried in a mountain of code dependencies? Worried about staying on the right side of open-source licenses? We've got you covered. Let's talk about how you can automate license checks in your CI pipelines. 🚀
Why Should You Care?
Like many of you, we're big fans of open-source components. But hey, we also want to play nice and respect everyone's rights. So we decided to auto-approve certain “safe” licenses like Apache 2.0 and MIT, and review any other license types we bump into “on the go,” approving or rejecting particular packages case by case. Sound like you? Then read on!
The Setup: GitHub Actions
We're using GitHub Actions as our go-to CI tool. Don't worry, the process is straightforward: just a bit of Python and some GitHub magic. We're building a reusable workflow for our many microservices and storing it in our GHA-Store repo.
Your First Step: Reusable Workflow File
Here's a snippet for initiating the reusable workflow:
on:
  workflow_call:
    inputs:
      inherit-inputs:
        required: false
        type: string
env:
  CHECKOUT_REF: ${{ fromJSON(inputs.inherit-inputs).ref || github.event.pull_request.head.sha }}
Nothing too crazy, right?
This sets up the workflow to be reusable and passes inputs down from the parent workflow to the child. Because a reusable workflow can't receive all of the parent's inputs by default, we pack them into JSON and pass that as a single argument to this workflow.
CHECKOUT_REF shows how we unpack the value again. I'll show the packing at the end of this post, where we call this workflow.
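To make the fallback in CHECKOUT_REF concrete, here is a minimal Python sketch of the same idea: decode the JSON string, use "ref" if it is set, otherwise fall back to the PR head SHA. The function name `resolve_checkout_ref` and the sample values are hypothetical; the real evaluation happens in the GitHub Actions expression above.

```python
import json


def resolve_checkout_ref(inherit_inputs, pr_head_sha):
    """Mimic `fromJSON(inputs.inherit-inputs).ref || pull_request.head.sha`."""
    # Assumption for this sketch: an empty string or JSON null means
    # "the parent passed no inputs".
    parent_inputs = json.loads(inherit_inputs) if inherit_inputs else None
    ref = (parent_inputs or {}).get("ref")
    # An empty or missing ref is falsy, so the PR head SHA wins.
    return ref or pr_head_sha


print(resolve_checkout_ref('{"ref": "release-1.2"}', "abc123"))  # release-1.2
print(resolve_checkout_ref('{"ref": ""}', "abc123"))             # abc123
print(resolve_checkout_ref("null", "abc123"))                    # abc123
```

The `or` at the end plays the role of `||` in the workflow expression: a manually supplied ref takes priority, and everything else falls through to the pull request's head commit.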
Let's Get to Work: The Main Job
1. Fetch the Source Code and License Config
Your job starts with fetching your repo and a separate repo containing your license configuration. Here's the code:
jobs:
  deps-lic-check:
    runs-on: ubuntu-latest
    steps:
      - name: Generate token
        id: generate_token
        uses: tibdex/github-app-token@v1
        with:
          app_id: ${{ secrets.APP_ID }}
          private_key: ${{ secrets.PRIVATE_KEY }}
      - name: Checkout current repo
        uses: actions/checkout@v4
        with:
          ref: ${{ env.CHECKOUT_REF }}
          fetch-depth: 1
      - name: Checkout configuration
        uses: actions/checkout@v3
        with:
          repository: perfectscale-io/gha-store # replace with your org/repo
          ref: v1
          token: ${{ steps.generate_token.outputs.token }}
          path: gha-store
          fetch-depth: 1
The "Checkout current repo" step is simple, but pay attention to "ref". We set it so that the job checks out the actual head commit of the PR; by default, on a pull request event, the checkout action fetches an ephemeral commit that merges the PR into the target branch.
To access the configuration, we need to check out another repo (inside our org). To do so, we created a GitHub App with a specific set of permissions.
As a result, we have the following working directory structure:
pkg/
main.go
go.mod
go.sum
gha-store/
2. Setting Up Go and Vendor
If you're dealing with a Golang repo, you'll need to set up Go and, for private dependencies, access to your private modules like so:
- uses: actions/setup-go@v4
  with:
    go-version: 'stable'
- name: Setup vendor
  run: |
    go env -w GOPRIVATE="github.com/perfectscale-io" # replace with your org
    git config --global url." "
  env:
    TOKEN: x-access-token:${{steps.generate_token.outputs.token}}
Here we set up Go and again reuse the token from the "Generate token" step.
If you don't have private dependencies, you can omit the "Setup vendor" step.
3. Generate Software Bill of Materials (SBOM)
Why an SBOM? It gives you a single, machine-readable list of all your dependencies.
For Golang we use "CycloneDX/gh-gomod-generate-sbom@v2" from CycloneDX.
Note: we chose OWASP CycloneDX because it is a full-stack Bill of Materials (BOM) standard that provides advanced supply chain capabilities for cyber risk reduction. It is managed by the CycloneDX Core Working Group and backed by the OWASP Foundation.
As output, it generates a JSON file with all our dependencies.
Take a look:
- name: Generate SBOM
  uses: CycloneDX/gh-gomod-generate-sbom@v2
  with:
    version: v1
    args: app -licenses=true -assert-licenses=true -json=true -output sbom.json .
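As a rough orientation, the part of the CycloneDX JSON that the later script reads looks like the sample below. The sample is hand-written and trimmed to the fields used in this post ("components", "name", "licenses", "license.id"); the package names and versions are made up, and a real SBOM contains many more fields.

```python
import json

# Hand-written, trimmed sample of a CycloneDX JSON SBOM (illustrative only).
SAMPLE_SBOM = """
{
  "bomFormat": "CycloneDX",
  "components": [
    {
      "type": "library",
      "name": "github.com/example/alpha",
      "version": "v1.0.0",
      "licenses": [{"license": {"id": "MIT"}}]
    },
    {
      "type": "library",
      "name": "github.com/example/beta",
      "version": "v0.3.1",
      "licenses": [{"license": {"id": "GPL-3.0-only"}}]
    }
  ]
}
"""

sbom = json.loads(SAMPLE_SBOM)
for component in sbom["components"]:
    # Each component carries a list of license entries keyed by SPDX id.
    ids = [entry["license"]["id"] for entry in component.get("licenses", [])]
    print(component["name"], ids)
```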
By using the SBOM as the standard source of dependency information, we can support other languages later without refactoring the check, and keep our code simple.
The only thing left is to parse this SBOM and compare the dependency licenses with the allowed list.
To do that, we run a simple Python script directly in a GitHub Actions step using "shell: python".
But before we start, let's look at our "allowed-licenses.yaml" (or "configuration") file:
---
# array of allowed licenses
allowed:
  - MIT
  - Apache-2.0
# array of package names to ignore during check
ignore:
  - github.com/golang/protobuf
Here we have 2 arrays:
- allowed - array ([]string) of allowed licenses
- ignore - array ([]string) of packages which we ignore and allow any license for them
We store this configuration file in the "helpers" folder of our gha-store repo, so we can update it as often as we want. Every new workflow run picks up the updated configuration.
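One thing to watch: if a key such as "ignore" is left empty in the YAML file, a YAML parser returns None for it rather than an empty list. A small normalization step, sketched below on an already-parsed dictionary, keeps later membership checks safe. The helper name `normalize_config` is hypothetical and not part of the original workflow.

```python
def normalize_config(raw):
    """Return (allowed, ignored) as sets, tolerating missing or empty keys."""
    raw = raw or {}
    allowed = set(raw.get("allowed") or [])
    ignored = set(raw.get("ignore") or [])
    return allowed, ignored


# Shape of the parsed allowed-licenses.yaml described above.
parsed = {"allowed": ["MIT", "Apache-2.0"], "ignore": ["github.com/golang/protobuf"]}
allowed, ignored = normalize_config(parsed)
print(sorted(allowed))  # ['Apache-2.0', 'MIT']
print(sorted(ignored))  # ['github.com/golang/protobuf']

# An empty `ignore:` key parses to None; it becomes an empty set here.
print(normalize_config({"allowed": ["MIT"], "ignore": None}))  # ({'MIT'}, set())
```

Using sets also makes the "is this license allowed?" lookup independent of the list length, although for a handful of licenses that is a matter of taste.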
4. Python Magic: Analyzing Licenses
Now, the real fun begins! We've written a Python script that does all the heavy lifting.
- name: Analyze licenses
  shell: python
  run: |
    import sys
    import json
    import yaml

    def generate_markdown_table(non_allowed_licenses):
        table = "| License ID | Packages |\n"
        table += "|------------|----------|\n"
        for license_id, packages in non_allowed_licenses.items():
            table += f"| {license_id} | {', '.join(packages)} |\n"
        return table

    def extract_licenses(json_data):
        # Map each license ID to the list of packages that declare it.
        license_package_mapping = {}
        for component in json_data['components']:
            package_name = component['name']
            licenses = component.get('licenses', [])
            for lic in licenses:
                license_id = lic['license']['id']
                license_package_mapping.setdefault(
                    license_id, []).append(package_name)
        return license_package_mapping

    def check_licenses(license_package_mapping, allowed_licenses, ignored_packages):
        # Keep only licenses outside the allowed list, minus ignored packages.
        non_allowed_licenses = {}
        for license_id, packages in license_package_mapping.items():
            if license_id not in allowed_licenses:
                non_ignored_packages = [
                    pkg for pkg in packages if pkg not in ignored_packages]
                if non_ignored_packages:
                    non_allowed_licenses[license_id] = non_ignored_packages
        return non_allowed_licenses

    with open('sbom.json', 'r') as f:
        json_data = json.load(f)
    with open('gha-store/helpers/allowed-licenses.yaml', 'r') as f:
        yaml_data = yaml.safe_load(f)

    extracted_licenses = extract_licenses(json_data)
    allowed_licenses = yaml_data['allowed']
    ignored_packages = yaml_data['ignore']
    non_allowed_licenses = check_licenses(
        extracted_licenses, allowed_licenses, ignored_packages)

    if non_allowed_licenses:
        print("Non-allowed licenses and their packages:")
        for license_id, packages in non_allowed_licenses.items():
            print(f"License: {license_id}, Packages: {', '.join(packages)}")
        markdown_table = generate_markdown_table(non_allowed_licenses)
        with open("result.md", "w") as f:
            f.write("# Non-Allowed Licenses and Their Packages\n")
            f.write(markdown_table)
        sys.exit(1)
    else:
        with open("result.md", "w") as f:
            f.write("# No problems with Licenses\n")
        print("All licenses are allowed.")
        sys.exit(0)
Here we compare the dependency licenses from the SBOM with the allowed licenses and log every package that has a non-allowed license and is not ignored.
We also generate a Markdown table with the "failed" packages.
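The script assumes every license entry has "license.id". CycloneDX also allows an entry to carry "license.name" or an "expression" instead, in which case `lic['license']['id']` would raise a KeyError. A more defensive variant might look like the sketch below; the "UNKNOWN" bucket and the fallback order are assumptions of this example, not part of the original workflow.

```python
def extract_licenses(sbom):
    """Map each license identifier to the packages that declare it."""
    mapping = {}
    for component in sbom.get("components", []):
        name = component["name"]
        # Treat a missing or empty license list as one empty entry.
        entries = component.get("licenses") or [{}]
        for entry in entries:
            lic = entry.get("license", {})
            # Prefer the SPDX id, then a free-text name, then an expression.
            license_id = (
                lic.get("id") or lic.get("name") or entry.get("expression") or "UNKNOWN"
            )
            mapping.setdefault(license_id, []).append(name)
    return mapping


sample = {
    "components": [
        {"name": "pkg/a", "licenses": [{"license": {"id": "MIT"}}]},
        {"name": "pkg/b", "licenses": [{"license": {"name": "Custom License"}}]},
        {"name": "pkg/c", "licenses": [{"expression": "MIT OR Apache-2.0"}]},
        {"name": "pkg/d"},  # no license information at all
    ]
}
print(extract_licenses(sample))
```

Note the design choice: components without any license data land in "UNKNOWN", so they are flagged for review rather than passing silently. Whether that is the behavior you want depends on how strict your policy is.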
5. Keeping It User-Friendly: GitHub Actions Summary
We want the output to be as developer-friendly as possible, so we use GitHub Actions Summary:
- name: Update summary
  if: always()
  run: echo -e "$(cat result.md)" >> $GITHUB_STEP_SUMMARY
No magic, just GitHub Actions.
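If you would rather skip the intermediate file, the Python step could append to the summary itself. GITHUB_STEP_SUMMARY is an environment variable that holds a file path; the helper name `append_summary` and the fallback to standard output are assumptions for this sketch.

```python
import os
import sys


def append_summary(markdown, env=os.environ):
    """Append Markdown to the job summary, or print it when run locally."""
    path = env.get("GITHUB_STEP_SUMMARY")
    if not path:
        # Not running inside GitHub Actions: fall back to stdout.
        sys.stdout.write(markdown)
        return None
    with open(path, "a", encoding="utf-8") as f:
        f.write(markdown)
    return path


append_summary("# No problems with licenses\n")
```

The `env` parameter exists only to make the function easy to exercise outside of CI by passing a plain dictionary.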
6. Don't Let It Slide: Slack Notifications
If anything goes south, you'll get a Slack notification. Trust me, you want to set this up!
- name: Notify about deploy
  if: failure()
  uses: slackapi/slack-github-action@v1.23.0
  with:
    payload: |
      {"text": "License issue in ${{ github.event.repository.name }}\nRef: ${{ github.ref_name }}\nWorkflow url: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"}
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.GLOBAL_SLACK_DEPENDENCY_LICENSE }}
    SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
To set this up, create a new Slack application and add it to your workspace. Then add a new incoming webhook and save its URL as a global secret, for example "GLOBAL_SLACK_DEPENDENCY_LICENSE", as we do.
You can read more about it in Slack documentation: https://api.slack.com/messaging/webhooks
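To try the webhook outside of CI, the same message can be built and posted from Python. This is a minimal sketch using only the standard library; the function names and the sample values are hypothetical, and the webhook URL must come from your own Slack app.

```python
import json
import urllib.request


def build_payload(repo_name, ref_name, server_url, repository, run_id):
    """Build the same text message the workflow step sends."""
    run_url = f"{server_url}/{repository}/actions/runs/{run_id}"
    text = f"License issue in {repo_name}\nRef: {ref_name}\nWorkflow url: {run_url}"
    return {"text": text}


def post_to_slack(webhook_url, payload):
    """POST the payload to a Slack incoming webhook; returns the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


payload = build_payload("my-service", "main", "https://github.com", "my-org/my-service", 42)
print(json.dumps(payload))
# post_to_slack("<your webhook URL>", payload)  # uncomment with a real URL
```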
Call me ...
We've finished our reusable workflow; now we need to call it:
# Perfectscale-Go-Workflow.yaml
name: PSC Golang Workflow
on:
  push:
    branches:
      - 'main'
  pull_request:
  workflow_dispatch:
    inputs:
      ref:
        type: string
        description: 'Optional: Commit, branch etc.'
        required: false
permissions:
  checks: write
  id-token: write
jobs:
  deps-lic-check:
    uses: perfectscale-io/gha-store/.github/workflows/_deps_lic_check.yaml@v2
    with:
      inherit-inputs: ${{ toJSON(inputs) }}
    secrets: inherit
As promised, this is how to pass the parent's inputs to the child workflow: we serialize them with toJSON and call the reusable workflow to check our licenses.
Bringing It All Together
And there you have it, folks! Automating license checks is easier than you thought. With just a bit of Python and GitHub Actions, you're all set to ensure you're not stepping on any legal landmines.
So, go ahead, give it a try, and let us know how it goes. Happy coding! 🎉
Full reusable workflow:
on:
  workflow_call:
    inputs:
      inherit-inputs:
        required: false
        type: string
env:
  CHECKOUT_REF: ${{ fromJSON(inputs.inherit-inputs).ref || github.event.pull_request.head.sha }}
jobs:
  deps-lic-check:
    runs-on: ubuntu-latest
    steps:
      - name: Generate token
        id: generate_token
        uses: tibdex/github-app-token@v1
        with:
          app_id: ${{ secrets.APP_ID }}
          private_key: ${{ secrets.PRIVATE_KEY }}
      - uses: actions/checkout@v4
        with:
          ref: ${{ env.CHECKOUT_REF }}
          fetch-depth: 1
      - uses: actions/checkout@v3
        with:
          repository: perfectscale-io/gha-store
          ref: v2
          token: ${{ steps.generate_token.outputs.token }}
          path: gha-store
          fetch-depth: 1
      - uses: actions/setup-go@v4
        with:
          go-version: 'stable'
      - name: Setup vendor
        run: |
          go env -w GOPRIVATE="github.com/perfectscale-io"
          git config --global url." "
        env:
          TOKEN: x-access-token:${{steps.generate_token.outputs.token}}
      - name: Generate SBOM
        uses: CycloneDX/gh-gomod-generate-sbom@v2
        with:
          version: v1
          args: app -licenses=true -assert-licenses=true -json=true -output sbom.json .
      - name: Analyze licenses
        shell: python
        run: |
          import sys
          import json
          import yaml

          def generate_markdown_table(non_allowed_licenses):
              table = "| License ID | Packages |\n"
              table += "|------------|----------|\n"
              for license_id, packages in non_allowed_licenses.items():
                  table += f"| {license_id} | {', '.join(packages)} |\n"
              return table

          def extract_licenses(json_data):
              # Map each license ID to the list of packages that declare it.
              license_package_mapping = {}
              for component in json_data['components']:
                  package_name = component['name']
                  licenses = component.get('licenses', [])
                  for lic in licenses:
                      license_id = lic['license']['id']
                      license_package_mapping.setdefault(
                          license_id, []).append(package_name)
              return license_package_mapping

          def check_licenses(license_package_mapping, allowed_licenses, ignored_packages):
              # Keep only licenses outside the allowed list, minus ignored packages.
              non_allowed_licenses = {}
              for license_id, packages in license_package_mapping.items():
                  if license_id not in allowed_licenses:
                      non_ignored_packages = [
                          pkg for pkg in packages if pkg not in ignored_packages]
                      if non_ignored_packages:
                          non_allowed_licenses[license_id] = non_ignored_packages
              return non_allowed_licenses

          with open('sbom.json', 'r') as f:
              json_data = json.load(f)
          with open('gha-store/helpers/allowed-licenses.yaml', 'r') as f:
              yaml_data = yaml.safe_load(f)

          extracted_licenses = extract_licenses(json_data)
          allowed_licenses = yaml_data['allowed']
          ignored_packages = yaml_data['ignore']
          non_allowed_licenses = check_licenses(
              extracted_licenses, allowed_licenses, ignored_packages)

          if non_allowed_licenses:
              print("Non-allowed licenses and their packages:")
              for license_id, packages in non_allowed_licenses.items():
                  print(f"License: {license_id}, Packages: {', '.join(packages)}")
              markdown_table = generate_markdown_table(non_allowed_licenses)
              with open("result.md", "w") as f:
                  f.write("# Non-Allowed Licenses and Their Packages\n")
                  f.write(markdown_table)
              sys.exit(1)
          else:
              with open("result.md", "w") as f:
                  f.write("# No problems with Licenses\n")
              print("All licenses are allowed.")
              sys.exit(0)
      - name: Update summary
        if: always()
        run: echo -e "$(cat result.md)" >> $GITHUB_STEP_SUMMARY
      - name: Notify about deploy
        if: failure()
        uses: slackapi/slack-github-action@v1.23.0
        with:
          payload: |
            {"text": "License issue in ${{ github.event.repository.name }}\nRef: ${{ github.ref_name }}\nWorkflow url: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"}
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.GLOBAL_SLACK_DEPENDENCY_LICENSE }}
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
Common Open Source Software (OSS) Library Licenses
Open source software (OSS) libraries typically use a variety of licenses, but some are more common than others. Always read and understand the terms of a license before using or contributing to a project.
| License | Type | Restrictiveness | Typical use | Key obligations |
|---------|------|-----------------|-------------|-----------------|
| MIT License | Permissive | Permissive | Commercial and non-commercial projects, proprietary software. | Include the original license and copyright notice in any copy or substantial portion of the software. |
| GNU GPL (v2 or v3) | Copyleft | Restrictive | Free and open source projects; may impact commercial use and proprietary software. | Distribute derivative works under the same license. Provide access to the source code of the software when distributing binaries. |
| Apache License | Permissive | Permissive | Commercial and non-commercial projects, widely used in corporate environments. | Include the original copyright, license, and notice in any copy or substantial portion of the software. |
| BSD License (2 or 3-Clause) | Permissive | Permissive | Commercial and non-commercial projects, allows for use in proprietary software. | Include the original copyright, license, and disclaimer in any copy or substantial portion of the software. |
| Creative Commons | Various | Varies | Not typically used for software, but for creative works like images, music, etc. Usage varies based on the specific Creative Commons license chosen. | Obligations vary with the specific Creative Commons license (e.g., attribution, non-commercial use, share-alike). |
| Mozilla Public License (MPL) | Copyleft | Partially Restrictive | Free and open source projects; allows combining with proprietary code in a larger work. Mozilla products often use this license. | Keep MPL-licensed files, including modified ones, under the MPL and make their source code available; include a copy of the MPL. Other files in a larger work may use a different license. |