We live at a fast pace, with data collected everywhere, all the time. Raw data, however, is of little use without the thorough understanding that comes from in-depth analysis and interpretation. For this, the data must be highly available and integrated into different systems and applications. The market is full of technologies and systems, most of them shifting the playing field towards the cloud. The advantages of this infrastructure option are well known: high availability, lower scaling costs and faster upgrades.
In this article we look at Amazon Web Services and its cloud technologies, with a focus on DynamoDB, API Gateway and Lambda functions. Together, these three services can be used to build an API that gives access to data and serves fast, scalable statistics.
To begin with, DynamoDB is a non-relational, fully managed and scalable database that provides low-latency data access and an out-of-the-box HTTP API for querying and managing the data. Though it has some limitations when it comes to high volumes (for example, a single item cannot exceed 400 KB and a single Query or Scan call returns at most 1 MB of data), DynamoDB is very useful when we want to provide fast access to, and high availability of, the data.
Using API Gateway, DynamoDB's API can be exposed in a simplified form to other applications and users who want to obtain data in a fast, secure and managed way. We can wrap DynamoDB's querying operations in more intuitive endpoints and hide the technical aspects, so that end users can call those endpoints without detailed knowledge of DynamoDB's technology and concepts. We can also manage access to the database easily, simply by managing the authorisation roles used by the API Gateway integrations.
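To make the idea of wrapping more concrete, here is a minimal sketch of a Lambda function sitting behind an API Gateway proxy integration. The table name `Statistics`, the key attribute `customer_id` and the path parameter `customerId` are hypothetical and not taken from the project described in this article. The DynamoDB call is passed in as an argument so that the translation logic can be tried out without AWS access, and the flattening step assumes items that contain only scalar string and number attributes.

```python
import json

TABLE_NAME = "Statistics"  # hypothetical table name


def build_query_params(event):
    """Translate an API Gateway proxy event into DynamoDB Query parameters."""
    customer_id = (event.get("pathParameters") or {}).get("customerId")
    if not customer_id:
        raise ValueError("customerId path parameter is required")
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "customer_id = :cid",
        "ExpressionAttributeValues": {":cid": {"S": customer_id}},
    }


def handle(event, query):
    """Run the query and shape the result as an API Gateway proxy response.

    `query` is any callable that accepts the same keyword arguments as the
    `query` method of the boto3 DynamoDB client.
    """
    try:
        params = build_query_params(event)
    except ValueError as exc:
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
    result = query(**params)
    # Flatten DynamoDB's typed attributes, e.g. {"S": "x"} becomes "x".
    # Assumption: only scalar string/number attributes are present.
    items = [
        {name: next(iter(typed.values())) for name, typed in item.items()}
        for item in result.get("Items", [])
    ]
    return {"statusCode": 200, "body": json.dumps({"items": items})}


def lambda_handler(event, context):
    """Entry point configured in Lambda."""
    import boto3  # imported lazily so the helpers above run without AWS

    return handle(event, boto3.client("dynamodb").query)
```

With this shape, the caller only needs to know a URL such as `/statistics/{customerId}` (again, a hypothetical route), while key conditions and attribute types stay inside the function.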
For example, we needed to expose, through a REST endpoint, data that was available only in a data warehouse. Developing an entire REST layer over that database would not have been the most viable solution. Because only a selected subset of the data was needed, we chose to store it in DynamoDB, which comes with an out-of-the-box API and can be easily wrapped by Lambda functions and API Gateway. We put in place a process that synchronises the data between the two data sources on a daily basis, using an ETL package developed with SSIS together with the integration between the aforementioned AWS services. This way, we avoided deploying the REST API to a server and gained easier maintenance and greater flexibility for future implementations and modifications of the table.
There were challenges, such as the limit of DynamoDB's BatchWriteItem operation, which accepts at most 25 put or delete requests per batch. In the end, with a better provisioned write capacity configuration, we managed to export the data to DynamoDB without having to implement complex logic in the SSIS custom destination.
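The project described here resolved the issue through capacity configuration rather than code. For readers who do need to split writes themselves, the following is a minimal sketch of how the 25-request limit could be handled. The function names are illustrative, `client` is assumed to behave like a boto3 DynamoDB client, and the retry settings are arbitrary defaults.

```python
import time

MAX_BATCH_SIZE = 25  # BatchWriteItem limit on put/delete requests per call


def chunked(items, size=MAX_BATCH_SIZE):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def write_all(client, table_name, items, max_retries=5, base_delay=0.1):
    """Write `items` (already in DynamoDB attribute-value format) in batches.

    Requests that DynamoDB returns as unprocessed, typically because of
    throttling, are retried with exponential backoff.
    """
    for batch in chunked(items):
        pending = {table_name: [{"PutRequest": {"Item": item}} for item in batch]}
        for attempt in range(max_retries + 1):
            response = client.batch_write_item(RequestItems=pending)
            pending = response.get("UnprocessedItems") or {}
            if not pending:
                break  # this batch is fully written
            time.sleep(base_delay * 2 ** attempt)
        else:
            raise RuntimeError("items still unprocessed after retries")
```

A call might look like `write_all(boto3.client("dynamodb"), "Statistics", items)`, where `Statistics` is a hypothetical table name. Retrying the `UnprocessedItems` returned by each call matters because a throttled BatchWriteItem request can succeed only partially.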
In conclusion, AWS provides interesting technologies for implementing serverless services with fast response times, high scalability, easy management and interconnectivity, all with the comfort of choosing among several widespread programming languages. The cloud has been the new infrastructure reality for some time and it is here to stay, with all its pros and with cons that are constantly being reduced.