Scaling API Calls from the DB Using Apache NiFi

Nowadays it is easier than ever to break down a monolith into microservices, so that an application can scale as a set of small building blocks. Those services talk to each other through APIs.

An application programming interface (API) is a connection between computers or between computer programs. It is a type of software interface, offering a service to other pieces of software.[1] A document or standard that describes how to build such a connection or interface is called an API specification. A computer system that meets this standard is said to implement or expose an API. The term API may refer either to the specification or to the implementation.

APIs have changed the way we develop software. They are very easy to use and are widely used whenever data needs to be acquired.

Organizations today use APIs to share information externally with:

  1. Different regulators
  2. Other companies, as part of integrations

Another important use of APIs is internal: day-to-day operations and support. Every company today has some level of support, divided into tiers:

1. Tier 1
2. Tier 2
3. Tier 3 + 4

Each support level relies on internal tools, from back-office systems to queries run directly against the DB, to investigate and mitigate problems.

Tiers 3 and 4, especially, are more technical teams with a more comprehensive view of the company's architecture. They know which components are connected and what to do to solve an issue.
One of the tools they use is calling an internal application through its API.

This is a powerful way to imitate the application flow and solve the issue in a proper and documented way.

Tools of the trade

There are many ways to call an API, from a simple GET in your browser to writing scripts and using off-the-shelf tools like Postman.

All of the above are great and can be run by each member of the team individually without any interference.

If the previous line caused you to cringe a little, then you have some sense of security & auditing in mind.

When calling internal APIs we need to follow the guidelines below:

A. Security: we cannot allow just anyone to call internal APIs. We need to restrict access to trusted people within the organization, for example by granting access only to specific IP / MAC addresses.
Another option is to block human access altogether so that ONLY applications can call each other, i.e. only calls originating from applications within that environment's VLAN are allowed. A third option is to control access through a VPN, but that requires a lot of handling.

B. Auditing: if a team member runs a script from their own workstation, will anyone know?
How can we oversee such actions?

C. Consistency: let's say we agree on a single script to run our calls.
What if I need to change it just a little bit, so that every call I make now has a different footprint?
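To make guideline A a bit more concrete, here is a minimal sketch of an IP allow-list check using Python's standard `ipaddress` module. The addresses and the `is_caller_allowed` helper are made up for illustration; in practice this kind of rule usually lives in a firewall or gateway rather than in application code.

```python
import ipaddress

# Hypothetical allow-list: both entries are invented values, not real networks.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.20.30.40/32"),  # a single trusted workstation
    ipaddress.ip_network("10.50.0.0/24"),    # the application environment's VLAN
]


def is_caller_allowed(source_ip: str) -> bool:
    """Return True if source_ip falls inside any allowed network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_NETWORKS)


if __name__ == "__main__":
    print(is_caller_allowed("10.50.0.17"))   # inside the VLAN range
    print(is_caller_allowed("192.168.1.5"))  # outside every allowed range
```

Note that a check like this covers only the security guideline; it says nothing about who ran a call or why, which is where auditing comes in.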

The most common tool today is Postman. It is very light and easy to use, and it falls short on only one thing: proper auditing.

We therefore need a way to make those API calls more contained, more secure, and possible to interrogate later.

Introducing APIer

APACHE NIFI

Apache NiFi supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.

NiFi is a great tool for data shipment and enrichment. It can be used for many purposes and can process thousands of flow files in seconds. That is why we chose it for this flow, as it can:

1. Make HTTP/S calls.
2. Get data from the DB.
3. Update the DB.
4. Handle the load.
5. Control the flow based on many attributes.

Solution architecture

The general architecture is illustrated below:

APIer architecture

1. The support team gets a ticket about a bad population that needs to be fixed.
2. They investigate and identify the affected population.
3. They load the data into a dedicated table in the DB (the full SQL schema is in the GitHub repository linked below).
4. NiFi continuously watches that table. Once it finds rows with status “0” [new and ready], it extracts them into flow files and updates the DB with status “1” [in transit].
5. Based on the data in that table, it makes the API calls (more detailed flow below) and gets back the response code for each call.
6. NiFi updates the DB with the response for each call.
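Steps 4 to 6 can be sketched in plain Python to show the claim, call, and write-back pattern. This is only an illustration of the logic, not the NiFi flow itself: it uses an in-memory SQLite table as a stand-in for the real DB, a simplified `Method` text column instead of the `MethodID` foreign key, an injected `send` function instead of real HTTP calls, and an assumed final status `2`, which the steps above do not name.

```python
import sqlite3
from typing import Callable

STATUS_NEW = 0         # from the steps above: new and ready
STATUS_IN_TRANSIT = 1  # from the steps above: picked up by the flow
STATUS_DONE = 2        # assumption: the final status value is not given


def process_pending(conn: sqlite3.Connection,
                    send: Callable[[str, str, str], int]) -> int:
    """Claim new rows, call each endpoint, and store the response code."""
    rows = conn.execute(
        "SELECT ID, Method, URI, Body FROM APIer WHERE StatusID = ?",
        (STATUS_NEW,),
    ).fetchall()
    # Mark the rows as in transit before calling, so a second pass skips them.
    conn.executemany(
        "UPDATE APIer SET StatusID = ? WHERE ID = ?",
        [(STATUS_IN_TRANSIT, row[0]) for row in rows],
    )
    conn.commit()
    for row_id, method, uri, body in rows:
        code = send(method, uri, body)
        conn.execute(
            "UPDATE APIer SET StatusID = ?, ResponseBody = ? WHERE ID = ?",
            (STATUS_DONE, str(code), row_id),
        )
    conn.commit()
    return len(rows)


def fake_send(method: str, uri: str, body: str) -> int:
    """Stand-in for the HTTP call: pretend GETs succeed and the rest fail."""
    return 200 if method == "GET" else 500


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE APIer (ID INTEGER PRIMARY KEY, Method TEXT, URI TEXT, "
        "Body TEXT, StatusID INTEGER, ResponseBody TEXT)"
    )
    conn.executemany(
        "INSERT INTO APIer (Method, URI, Body, StatusID) VALUES (?, ?, ?, ?)",
        [("POST", "http://orders.internal/fix", "{}", STATUS_NEW),
         ("GET", "http://users.internal/42", "", STATUS_NEW)],
    )
    process_pending(conn, fake_send)
    return conn.execute(
        "SELECT ID, StatusID, ResponseBody FROM APIer ORDER BY ID"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

Because every call ends up as a row with its response code, the table itself becomes the audit trail that individual scripts and Postman sessions lack.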

NIFI FLOW

Using the above flow allows us to scale our calls tremendously, with a clear way to track every call and produce stats on how many calls succeeded and against which endpoint. We visualize it using Grafana for better understanding.
Some stats:
1. Started on 2019-01-23 and has been running ever since (908 days)
2. Made 8,760,046 calls so far, of which 6,894,019 were successful
3. Used by 20 different users (some are automation flows)
4. Called 30 different internal services over time
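For a rough sense of scale, the figures above can be turned into a success rate and a daily average. This is just arithmetic on the numbers quoted in the list; the variable names are mine.

```python
# Figures quoted in the stats list above.
days_running = 908
total_calls = 8_760_046
successful_calls = 6_894_019

failed_calls = total_calls - successful_calls
success_rate = successful_calls / total_calls
calls_per_day = total_calls / days_running

print(f"failed calls: {failed_calls:,}")                  # 1,866,027
print(f"success rate: {success_rate:.1%}")                # 78.7%
print(f"average calls per day: {calls_per_day:,.0f}")     # 9,648
```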

APIer stats in Grafana

We collect all NiFi logs into our ELK stack and create alerts on calls that were not successful, so we can understand why they failed.

SQL SCHEMA

Inside the dedicated SQL table “APIer”:

[ID] [int] IDENTITY(1,1) NOT NULL,
[MethodID] [int] NOT NULL, -- FK to APIerMethods; controls the CRUD type of the call
[Body] [nvarchar](max) NULL,
[Headers] [nvarchar](max) NULL,
[DelayInMiliseconds] [int] NULL,
[URI] [nvarchar](1000) NOT NULL, -- the endpoint NiFi will call
[StatusID] [int] NOT NULL,
[CreateDate] [datetime2](7) NOT NULL,
[UpdateDate] [datetime2](7) NOT NULL,
[Ticket] [nvarchar](200) NULL, -- ticket ID of the request
[RequestedUser] [varchar](250) NOT NULL,
[ResponseBody] [nvarchar](max) NULL -- the status of the call is written here
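As an illustration of how a request might be prepared for this table, here is a small sketch that builds the column values for a parameterized INSERT and checks the length limits from the schema. The `ApierRequest` class is hypothetical, and storing `Body` and `Headers` as JSON text is an assumption; the schema only says they are `nvarchar(max)`.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ApierRequest:
    """Hypothetical helper mirroring one row of the APIer table."""
    method_id: int              # FK to APIerMethods; actual ID values are not listed here
    uri: str
    requested_user: str
    body: Optional[dict] = None
    headers: dict = field(default_factory=dict)
    delay_ms: Optional[int] = None
    ticket: Optional[str] = None

    def to_row(self) -> dict:
        """Return column values for an INSERT, enforcing the schema's length limits."""
        if len(self.uri) > 1000:
            raise ValueError("URI exceeds nvarchar(1000)")
        if self.ticket is not None and len(self.ticket) > 200:
            raise ValueError("Ticket exceeds nvarchar(200)")
        if len(self.requested_user) > 250:
            raise ValueError("RequestedUser exceeds varchar(250)")
        return {
            "MethodID": self.method_id,
            # Assumption: body and headers are serialized as JSON text.
            "Body": json.dumps(self.body) if self.body is not None else None,
            "Headers": json.dumps(self.headers) if self.headers else None,
            "DelayInMiliseconds": self.delay_ms,
            "URI": self.uri,
            "StatusID": 0,  # 0 = new and ready, so the flow picks the row up
            "Ticket": self.ticket,
            "RequestedUser": self.requested_user,
        }


if __name__ == "__main__":
    request = ApierRequest(method_id=1, uri="http://orders.internal/fix",
                           requested_user="support.user", body={"orderId": 42},
                           ticket="TICKET-123")
    print(request.to_row())
```

`ID`, `CreateDate`, `UpdateDate`, and `ResponseBody` are left out on purpose, on the assumption that the database and the flow fill them in.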

The full APIer template can be found in the following GitHub repository, including:
1. A SQL folder with all the necessary tables and stored procedures
2. The NiFi XML (and JSON) template

https://github.com/mormor83/APIer

Conclusion

This flow has become the main tool for our support team (and some of the developer teams :) ) to run multiple API calls directly from the DB. With NiFi's built-in “ControlRate” processor we can throttle the flow files based on a few parameters. We can scale it however we want and, most importantly, we can log and trace each call.
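The idea behind rate control can be sketched in a few lines of Python. This is not how NiFi's ControlRate processor is implemented or configured; it is only a conceptual fixed-window throttle, with the `rate_limited` function and its parameters invented for illustration.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def rate_limited(items: Iterable[T], max_per_window: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> Iterator[T]:
    """Yield items, releasing at most max_per_window per window_seconds."""
    window_start = clock()
    released = 0
    for item in items:
        now = clock()
        if now - window_start >= window_seconds:
            # The window has already elapsed: start a fresh one.
            window_start, released = now, 0
        if released >= max_per_window:
            # Quota used up: wait for the rest of the window, then reset.
            sleep(window_seconds - (now - window_start))
            window_start, released = clock(), 0
        released += 1
        yield item


if __name__ == "__main__":
    # Release at most 2 calls per second.
    for call_id in rate_limited(range(5), max_per_window=2, window_seconds=1.0):
        print("dispatching call", call_id)
```

The `clock` and `sleep` parameters are injectable so the throttle can be exercised without actually waiting.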