On August 31, 2021, Docker suddenly became a paid service.
It caught many companies, and especially their DevOps teams, by surprise.
Who loves paying for images, right?
When you come to think about it, they are right to start charging companies a monthly fee. After all, they are providing a service.
One of the main shortcomings of this change is the rate limit on pulling images from the same source. When your organization gets hit with the following, it's a real headache for the DevOps team:
Failed to pull image "busybox:latest": rpc
error: code = Unknown desc = Error response
from daemon: toomanyrequests: You have
reached your pull rate limit. You may increase
the limit by authenticating and upgrading:
https://www.docker.com/increase-rate-limit
SonarQube
SonarQube is an automatic code review tool. We use it as part of our development pipelines to identify bugs and vulnerabilities in our code.
For a few years now, SonarQube has been among the top 10 tools every DevOps engineer should have in their toolbox.
To implement SonarQube as part of your pipeline, you should use SonarScanner:
https://docs.sonarqube.org/latest/analyzing-source-code/scanners/sonarscanner/
SonarScanner has extensions for Azure DevOps, Jenkins, and GitHub Actions.
In this article we will cover GitHub Actions.
What is the issue?
See the diagram below.
Every time we run a build, usually from multiple branches, we call Docker Hub to download our base image.
The API rate limit on Docker Hub for anonymous users is set to 100 pulls per 6 hours per IP address.
Basic logic tells us: why not just get a subscription?
For authenticated users it's 200 pulls per 6-hour period, and users with a paid Docker subscription get up to 5,000 pulls per day.
Sounds like a lot of API calls to consume, but let's do the math:
~400 repos for a mid-level org × 3 branches to build from (limiting, for the sake of argument) × 5 builds a day = 6,000 calls.
Now, not all of them are built on a daily basis, but it's a real possibility that this limit will be reached from time to time.
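If you want to see where your builds actually stand against that limit, Docker documents a way to query it using the special ratelimitpreview/test repository. A minimal sketch, assuming curl and jq are available:

# Get an anonymous pull token for the rate-limit preview image
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)

# A HEAD request returns the ratelimit-limit / ratelimit-remaining headers
# without consuming a pull
curl -sI -H "Authorization: Bearer $TOKEN" \
  "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest" | grep -i ratelimit

Run it from the same network your builds come from, since anonymous limits are tracked per source IP.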
Another question that may hit you is: why not switch IPs on every runner?
Well, the simple answer is: GitHub-hosted runners already do that for you!
Every user building with GHA knows the following:
runs-on: ubuntu-latest
GHA will allocate a server for you, Linux-based (Ubuntu by default), for your code to run on.
Since you don't have any control over the runners' routing and location, you get a different IP on every run, so the rate limit will not apply.
Using self-hosted runners
Many organizations want to avoid the above and use their own self-hosted runners running “locally”. You can read more about it here.
Self-hosted runners have their share of advantages, but also some noticeable disadvantages.
In reference to the above, the “local” solution that we chose was running them as Kubernetes pods on our EKS cluster, for various reasons.
With that, all of our traffic came from the same IP ranges, causing us to hit that limit from time to time, especially considering the following when applying the scan action:
Official SonarQube Scan
Using this GitHub Action, scan your code with SonarQube to detect Bugs, Vulnerabilities and Code Smells in up to 27 programming languages!
The scan action is very easy to implement:
- name: SonarQube Scan
  uses: sonarsource/sonarqube-scan-action@master
  env:
    SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
    SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}
All you need to provide is the host URL and the token generated for this action from SonarQube.
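The scanner also typically expects a sonar-project.properties file at the root of the repository. A minimal sketch (the project key below is a placeholder, use the key defined for your project in SonarQube):

# Create a minimal sonar-project.properties in the repo root
cat > sonar-project.properties <<'EOF'
sonar.projectKey=my-org_my-repo
sonar.sources=.
EOF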
Behind the scenes, GHA will download the following repo:
https://github.com/SonarSource/sonarqube-scan-action
and build the scanner image using the Dockerfile:
FROM sonarsource/sonar-scanner-cli:4.8

LABEL version="1.2.0" \
      repository="https://github.com/sonarsource/sonarqube-scan-action" \
      homepage="https://github.com/sonarsource/sonarqube-scan-action" \
      maintainer="SonarSource" \
      com.github.actions.name="SonarQube Scan" \
      com.github.actions.description="Scan your code with SonarQube to detect Bugs, Vulnerabilities and Code Smells in up to 27 programming languages!" \
      com.github.actions.icon="check" \
      com.github.actions.color="green"

COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

COPY cleanup.sh /cleanup.sh
RUN chmod +x /cleanup.sh

ENTRYPOINT ["/entrypoint.sh"]
As you can see, the first line pulls sonarsource/sonar-scanner-cli:4.8.
That will lead GHA to https://hub.docker.com/r/sonarsource/sonar-scanner-cli to get the image.
Since the action image is built on every run, this additional step doubles the Docker Hub calls for every build we have, making the rate limit a very real threat to our ongoing development process.
So what can be done?
With a few simple steps you can eliminate the need for that additional call.
GHA allows you to use nested actions, so just follow the guidelines below (a sketch of steps 1 and 3 follows the list):
1. Download the CLI image from Docker Hub and push it to your own registry.
2. Clone the sonarqube-scan-action repo to your own org.
3. Modify the Dockerfile in the cloned repo to point to the new registry from step 1.
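A minimal sketch of steps 1 and 3, assuming your private registry lives at registry.example.com (the registry address and repository path are placeholders, substitute your ECR/ACR/Artifactory URL and make sure you are logged in to it):

# 1. Mirror the scanner CLI image into your own registry
docker pull sonarsource/sonar-scanner-cli:4.8
docker tag sonarsource/sonar-scanner-cli:4.8 registry.example.com/sonar-scanner-cli:4.8
docker push registry.example.com/sonar-scanner-cli:4.8

# 3. In your clone of sonarqube-scan-action, point the Dockerfile at the mirror
#    (or simply edit the FROM line by hand)
sed -i 's|^FROM sonarsource/sonar-scanner-cli:4.8|FROM registry.example.com/sonar-scanner-cli:4.8|' Dockerfile

The registry must of course be reachable (and, if private, authenticated) from the self-hosted runners that will build the action image.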
Now you are ready to build your workflow:
on:
  # Trigger analysis when pushing to your main branches, and when creating a pull request.
  push:
    branches:
      - main
      - master
      - develop
      - 'releases/**'

name: Main Workflow

jobs:
  sonarqube:
    runs-on: [self-hosted, linux, x64]
    steps:
      - name: Checkout sonarqube-scan-action
        uses: actions/checkout@v3
        with:
          repository: my-org/sonarqube-scan-action
          token: ${{ secrets.GH_PAT }} # `GH_PAT` is a secret that contains your PAT
      - name: SonarQube Scan
        uses: my-org/sonarqube-scan-action@master
        env:
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
          SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}
Notice the main change here: we are cloning our own copy of the action:
- name: Checkout sonarqube-scan-action
  uses: actions/checkout@v3
  with:
    repository: my-org/sonarqube-scan-action
    token: ${{ secrets.GH_PAT }} # `GH_PAT` is a secret that contains your PAT
and then using that repo to trigger the scan.
The above flow will now be:
In conclusion and next steps
A report from 2022 states that:
Docker Hub repositories hide over 1,650 malicious containers
So using the official, DevOps/DevSecOps-approved images is a must in every organization.
Malicious code could also be injected into the public registries run by AWS / Azure / GCP, but the scans they run are much more meticulous.
What about the images used by the applications?
Good question!
Part of moving to self-hosted runners is to move ALL the base images to a “local” registry and avoid using Docker Hub in our clusters at all.
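A quick way to take inventory before that migration is to list every image your cluster is currently pulling; this is a standard kubectl one-liner (it covers regular containers only, init containers are not included):

# List all container images referenced by running pods, with usage counts
kubectl get pods --all-namespaces \
  -o jsonpath="{.items[*].spec.containers[*].image}" \
  | tr -s '[[:space:]]' '\n' | sort | uniq -c

Anything in that list that still points at Docker Hub is a candidate to be mirrored into the local registry.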
For future releases, just repeat the above steps and make sure to change the Dockerfile accordingly.