Task automation with minimal resources

Throwing resources at a problem is rarely the right solution. Keeping a very powerful server waiting to process updates that you might push once a week basically means wasting money. In this article we will look for an alternative.

Introduction

Using the idea behind the Producer/Consumer pattern, we could use a normal web page to store any requests that come to our webhook and start any services we want, whenever hardware is available.

This article describes a possible solution, using Jenkins and a few small custom projects. We will not actually code the solution, so the languages are 100% up to the reader’s choice, but the recommendation is PHP on the producer side (since it is very common with hosting providers and very easy to set up properly) and TypeScript on the consumer side (since TypeScript is quite nice and trendy nowadays).

So, a clearer picture of what we want to do:

Create a PHP+MySQL application which will store all the valid incoming requests;
Extend this application to serve via an API the requests stored and make sure we have some sort of caching implemented;
Set up Jenkins inside a container;
Create the consumer: a small TypeScript application which will periodically query the PHP app to see if any new requests have been added and, if so, retrieve them and pass them as jobs to Jenkins;
Establish a way to test that everything functions as expected.

Benefits

You will be able to use basically any hosting service to host your queue manager (the PHP application);
Since you don’t have to expose the end consumer software (in our case it will be the Jenkins automation server) to the internet, your development environment is much safer;
You can basically consume these requests with whichever software you want, thus you are not bound to the Jenkins implementation we are going to do;
If you have an awesome computer running, you can host the consumer part directly on it, whenever you find it necessary.

Downsides

The consumer will periodically say “Hey, queue, is there something I can do?”. If this interrogation is done too often, we might end up trying to save resources on one side, but waste them on another;
At some point you might end up with some concurrency problems;
If you have multiple consumers, some jobs might be repeated or executed at the wrong time;
Things can become messy very fast if consuming entries from the queue leads to the creation of other entries which should have priority over the existing ones;
Logging, metrics, transparency on what is going on at a certain time, these can all be heavily improved;
The solution as described is not easily scalable.
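
The first downside, polling too often, can be softened by letting the consumer slow down while the queue stays empty. Below is a minimal sketch in TypeScript. The helper name nextPollDelay and the default values (5 seconds minimum, 5 minutes maximum, doubling factor) are assumptions made for illustration, not part of the actual project.

```typescript
// Hypothetical helper: the names and default values are illustrative only.
interface PollOptions {
  minDelayMs: number; // delay used right after work was found
  maxDelayMs: number; // upper bound while the queue stays empty
  factor: number; // multiplier applied after each empty poll
}

const defaultPollOptions: PollOptions = {
  minDelayMs: 5000,
  maxDelayMs: 300000,
  factor: 2,
};

// Returns how long the consumer should wait before asking the queue again.
function nextPollDelay(
  currentDelayMs: number,
  foundWork: boolean,
  options: PollOptions = defaultPollOptions,
): number {
  if (foundWork) {
    // Something was in the queue: go back to the shortest interval.
    return options.minDelayMs;
  }
  // Nothing to do: wait longer, but never beyond the configured maximum.
  const grown = currentDelayMs * options.factor;
  return Math.min(Math.max(grown, options.minDelayMs), options.maxDelayMs);
}
```

The consumer would call this after every poll and wait the returned number of milliseconds before asking the queue again.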

The easy way out

You could just install and configure Jenkins the way it is designed to be used.

This is the easy way, but having it out in the open on the “wild internet” sent chills down my spine, especially since at some point it needs to handle secrets.

Initially I thought of putting Jenkins in the middle of everything and behind a proxy which validated the input coming in on the webhooks. But I quickly realized that the proxy did not provide as much security as was needed. One way or another, I had to open Jenkins to the internet, even if only temporarily.

Producer/Consumer

The producer/consumer design pattern is used to decouple processes which produce data at different rates. It is often preferred when implementing microservices.

How it applies in our case:

Any website, service, anyone who calls our webhook will be a producer;
The queue/storage will be handled by the PHP application and it will be unlimited;
The consumer will be the TypeScript application which will pass the request on to wherever we decide.
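
To make the three roles more concrete, here is a tiny in-memory model of the queue, written in TypeScript. It is only a sketch: in the setup described here the queue lives in the PHP application backed by MySQL, and the names HookEntry and InMemoryHookQueue are invented for this illustration.

```typescript
// Illustrative model only; the real queue is the PHP + MySQL application.
interface HookEntry {
  id: number;
  payload: Record<string, string>;
  consumed: boolean;
}

class InMemoryHookQueue {
  private entries: HookEntry[] = [];
  private nextId = 1;

  // Producer side: store an incoming webhook request.
  enqueue(payload: Record<string, string>): HookEntry {
    const entry: HookEntry = { id: this.nextId++, payload, consumed: false };
    this.entries.push(entry);
    return entry;
  }

  // Consumer side: list the entries not consumed yet, oldest first.
  pending(): HookEntry[] {
    return this.entries.filter((entry) => !entry.consumed);
  }

  // Consumer side: mark an entry as consumed.
  // Returns false for unknown ids and for entries already consumed,
  // so a second consumer can notice that someone else was faster.
  markConsumed(id: number): boolean {
    const entry = this.entries.find((candidate) => candidate.id === id);
    if (entry === undefined || entry.consumed) {
      return false;
    }
    entry.consumed = true;
    return true;
  }
}
```

The three methods mirror what the queue manager has to offer over HTTP: store a request, serve the stored requests, and mark them as consumed.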

I have a general feeling that both the queue and the consumer can be simpler. With time and usage, points where it can be improved will pop up.

Webhooks

Webhooks (a.k.a. Reverse APIs) are user-defined HTTP callbacks which are triggered by some event happening in a third-party service or application. When that event happens, a request is made to the webhook, which in turn executes something in our application.

Explained by example: once a PR is merged into the main branch of a GitHub repository, GitHub will call a webhook on a notifying application, which will send a message in a Slack channel saying that a new merge has occurred. Simple, right?

The proxy and the queue manager

We will use an app built with PHP and MySQL, even if it’s not the ideal solution nowadays. This way, it can basically be hosted anywhere, without any special requirements. Also, we don’t expect many simultaneous requests towards this app (since it will be called only when some part of the code is updated in your projects).

I think it can easily be rewritten in any language, and I guess a proper message broker like RabbitMQ can come in handy in some cases. However, for our use case, we want to keep things as simple as possible.

We will use Lumen for completing the task. On the one hand, it is pretty lightweight and on the other hand it already implements support for most of the functionality needed in this small project. We assume you have composer and PHP already installed on your computer. However, you can also use a container for accessing these tools.

composer create-project --prefer-dist laravel/lumen automation-proxy
cd automation-proxy
git init
Commit the code at this point, to keep things tidy;
code .
In case you use Visual Studio Code and have its CLI installed, this command should open the editor in this folder.

The next things on the todo list:

Add docker/docker-compose support;
Enable the database support with Eloquent, without facades;
Create the endpoint for storing webhooks, enable facades;
Create the endpoint for serving the stored webhooks;
Create the endpoint for marking events as consumed;
Add minimal password protection to the API and hook routes;
Add throttling to limit the requests per minute;
Add some tests.
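
The throttling item deserves a short illustration. The sketch below shows the logic of a fixed-window limiter in TypeScript, only to keep all examples in one language; the PHP version would follow the same idea. The class name FixedWindowLimiter and its parameters are assumptions, not an existing Lumen API.

```typescript
// Hypothetical fixed-window limiter; not part of Lumen or of the project.
interface ThrottleWindow {
  startedAtMs: number; // when the current window began
  count: number; // requests seen in the current window
}

class FixedWindowLimiter {
  private readonly maxRequests: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, ThrottleWindow>();

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true if the client identified by clientKey may make a request now.
  allow(clientKey: string, nowMs: number = Date.now()): boolean {
    const current = this.windows.get(clientKey);
    if (current === undefined || nowMs - current.startedAtMs >= this.windowMs) {
      // No window yet, or the old one expired: start a fresh window.
      this.windows.set(clientKey, { startedAtMs: nowMs, count: 1 });
      return this.maxRequests >= 1;
    }
    if (current.count >= this.maxRequests) {
      return false; // limit reached for this window
    }
    current.count += 1;
    return true;
  }
}
```

With new FixedWindowLimiter(60, 60000) and the caller's IP address or access key as clientKey, each client would be limited to 60 requests per minute.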

artisan

I want to address the issue of artisan separately. Using an established framework such as Laravel (or Lumen, the younger, dumber brother, as a friend once said 🙂) comes with some nice tools.

You have already guessed that one such tool is artisan. It will help you clear cache, run migrations, create seeds. These are just a few examples, a lot more is available.

The Lumen version of artisan has some commands missing, but it still has the part which handles migrations, which is the most important one, in my opinion. The following commands are the ones used for handling the storage of requests:

php artisan migrate:install
Create the migration repository (the table which keeps track of executed migrations) in the database of the project;
php artisan make:migration create_hooks_table --create=hooks
This generates a migration which creates the new hooks table;
php artisan make:seeder HooksTableSeeder
This creates a seeder for the new table. A seeder populates the table with test data, for easier development;
php artisan migrate
Run the defined migrations;
php artisan db:seed
Seed the database with data;
composer dump-autoload
This command might be needed in case some of your classes are not found.

The consumer

The consumer will be bound to the Jenkins instance. This is why it is recommended that they be in the same network. It will check the connection to Jenkins before consuming, then it will retrieve the new requests stored, validate them again – as we are never supposed to trust the user input – and then pass them on to the Jenkins instance.
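
The "validate them again" step could look like the sketch below. It assumes that the queue manager returns each entry as a JSON object with a numeric id and a flat payload of string values. That shape, together with the names HookEntry and validateEntry, is an assumption made for this example.

```typescript
// Assumed entry shape; the real API of the queue manager may differ.
interface HookEntry {
  id: number;
  payload: Record<string, string>;
}

// Returns a clean HookEntry, or null if the raw value does not look right.
function validateEntry(raw: unknown): HookEntry | null {
  if (typeof raw !== "object" || raw === null) {
    return null;
  }
  const candidate = raw as { id?: unknown; payload?: unknown };
  const id = candidate.id;
  const rawPayload = candidate.payload;

  // The id must be a positive integer.
  if (typeof id !== "number" || !Number.isInteger(id) || id <= 0) {
    return null;
  }
  // The payload must be a plain object, not null and not an array.
  if (typeof rawPayload !== "object" || rawPayload === null || Array.isArray(rawPayload)) {
    return null;
  }
  // Copy only string values; anything else rejects the whole entry.
  const payload: Record<string, string> = {};
  for (const [key, value] of Object.entries(rawPayload)) {
    if (typeof value !== "string") {
      return null;
    }
    payload[key] = value;
  }
  return { id, payload };
}
```

Entries for which this returns null would simply not be forwarded to Jenkins.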

We will use TypeScript, for the sake of convenience. We could basically use any programming language, though. An interesting idea would be to use something like Agenda, which is dedicated to running jobs, but at this point and for the current idea of the article/project, using it might be overkill.

The following steps will be completed in order to create the consumer:

Create a basic structure for the whole project;
Implement the request for new events from the queue manager;
Create an abstract entry parser, which will basically forward the request to Jenkins;
Update the queue manager with the events that have been consumed.
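
The steps above can be sketched as follows. The class and function names are invented for this example, the network calls are passed in as functions so the logic stays independent of any HTTP library, and the Jenkins job is assumed to be parameterized (it is triggered through the buildWithParameters endpoint of the Jenkins remote API). Authentication with a Jenkins user and API token is left out.

```typescript
// All names are illustrative; the article does not prescribe them.
interface HookEntry {
  id: number;
  payload: Record<string, string>;
}

// What a parser produces: a description of the HTTP call to make.
interface ForwardRequest {
  method: "POST";
  url: string;
}

// Every target (Jenkins or something else) turns an entry into a request.
abstract class EntryParser {
  abstract toRequest(entry: HookEntry): ForwardRequest;
}

class JenkinsEntryParser extends EntryParser {
  private readonly jenkinsUrl: string;
  private readonly jobName: string;

  constructor(jenkinsUrl: string, jobName: string) {
    super();
    this.jenkinsUrl = jenkinsUrl.replace(/\/+$/, ""); // drop trailing slashes
    this.jobName = jobName;
  }

  // Pass the payload fields as build parameters of the configured job.
  toRequest(entry: HookEntry): ForwardRequest {
    const query = Object.entries(entry.payload)
      .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
      .join("&");
    const path = `/job/${encodeURIComponent(this.jobName)}/buildWithParameters`;
    const suffix = query.length > 0 ? `?${query}` : "";
    return { method: "POST", url: `${this.jenkinsUrl}${path}${suffix}` };
  }
}

// One consumer pass: forward each pending entry, then report the consumed ones.
async function consumeOnce(
  fetchPending: () => Promise<HookEntry[]>,
  parser: EntryParser,
  send: (request: ForwardRequest) => Promise<boolean>,
  markConsumed: (ids: number[]) => Promise<void>,
): Promise<number[]> {
  const consumedIds: number[] = [];
  for (const entry of await fetchPending()) {
    // Only entries which were actually accepted count as consumed.
    if (await send(parser.toRequest(entry))) {
      consumedIds.push(entry.id);
    }
  }
  if (consumedIds.length > 0) {
    await markConsumed(consumedIds);
  }
  return consumedIds;
}
```

Keeping the parser abstract means that a different target only needs another subclass, while consumeOnce stays the same.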

Mr. Jenkins

I have written quite a few articles on Jenkins and how to set it up. It is generally easy to use and customize, and this is one of the main reasons why I chose it. Another important reason is its ability to use agents to execute jobs. This way you can basically distribute the load and also take advantage of various tools, regardless of the operating system (e.g. an agent running on Windows for automatically testing your application against Microsoft Edge, or an agent with big computing capacity able to process something very fast).

For this project, Jenkins will be installed in a container, but we won’t go through the hassle of automating its setup. We will simply install and use it as if it were a normal installation, and make sure that the configuration files are saved/backed up.

These are some plugin suggestions:

Bitbucket Plugin
Build Name and Description Setter
Configuration as Code Plugin
Dashboard View
Git Parameter Plug-In
GitHub Branch Source Plugin
GitLab Plugin
Parameterized Trigger plugin
Publish Over SSH
SSH Agent Plugin
Simple Theme

The code for the container will be in the same git repository as the TypeScript application. Since they are tightly bound and both are rather small projects, it makes sense to keep them together.

Test

We need to make sure that everything is fine. So we will create a test project which will demonstrate a very simple use case – calling a job which sends an email every time a webhook request is consumed. The body of the email sent will provide full information about the webhook request which has just been consumed.

A simple test:

Start the queue manager, the consumer and Jenkins locally, using docker;
Log into Jenkins;
Simulate a request to the queue:
curl -i http://localhost:9000/hooks -d "greet=hello&access_key=<the key hook set in the .env file>"
If everything is ok, a new job should start running.
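
If you would rather script this check than type the curl command, the same request can be described in TypeScript. The sketch below only builds the request (a form-encoded POST to /hooks, which is what curl -d sends); the function name buildHookRequest is made up for this example, and the access key value is whatever you set in the .env file.

```typescript
// Hypothetical helper mirroring the curl call above.
interface HookRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildHookRequest(baseUrl: string, fields: Record<string, string>): HookRequest {
  // Encode the fields the same way an HTML form would send them.
  const body = Object.entries(fields)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/hooks`,
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body,
  };
}
```

On a runtime which provides fetch, the result could be sent with fetch(request.url, request), after which you would check in Jenkins that a new job started.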

A real life example

This implementation is intended as initial automation for my personal projects. I intend to build some applications and websites I have postponed for a while, and I will also try to improve the maintenance of some of my older open-source projects. The PHP application will be hosted on my web server and the other components on my own laptop with docker, whenever I know I need them.

Closing notes

Some ideas on what you can do with your own implementation of the above:

Automated tests and builds on or for different platforms, automated updates for different projects;
Generating and publishing new versions of packages;
Some CI/CD, why not?

And some improvement ideas:

Add some automatic documentation to the API itself (e.g. Swagger);
Make it run in Kubernetes;
To handle a queue entry, the consumer could use other applications, apart from Jenkins;
Create some sort of internal configuration, based on which the consumer could inject configuration values or secrets to be used by the applications which process the queue entry.

Also, something important: before implementing the above, you should check out the free plans of various automation providers. If you have only a few projects with very modest requirements, you can always go for GitHub Actions, for example.

I have been learning a lot about this topic in the past few months and did as much as possible to implement an easy and cost-effective solution for automating repetitive tasks. This implementation seems to provide small-scale automation which works.