Efficiency and robustness are arguably the two factors most critical to the success of any project, and even more so for advanced analytics projects. With large amounts of data and a plethora of analytical tools at your disposal, a multitude of methods and solutions may exist that produce comparable results within acceptable statistical limits of one another. Almost always, their impact depends on their reliability and on their efficiency in terms of time and computational resources. You find yourself striving for rapid progression from ideas to prototypes to production. At the same time, amidst the time crunch, it is imperative NOT to compromise on the accuracy and precision of the deliverables.
1. What are microservices?
Fueled by these necessities, among others, the microservices architecture, aka microservices, has emerged as a popular architectural style in the recent past. More and more projects across industries are adopting microservices in lieu of the traditional monolithic architecture. A microservices architecture comprises a collection of small, autonomous, loosely coupled components, each of which accomplishes a specific goal. Some of the key characteristics of microservices are:
- Each component is an independent codebase that can be deployed simultaneously by multiple development teams, facilitating convenient reusability and consistency across teams.
- Components are responsible for persisting their own data or external state, obviating the need for a separate data layer to handle data persistence.
- Inter-component communication is established via well-defined APIs, such that one component is agnostic of the internal implementation details of other components.
- Components don’t need to share the same technology stack, libraries, or frameworks.
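To make the API-contract idea concrete, here is a minimal in-process sketch. All names (`CleanRequest`, `CleanResponse`, `Cleaner`, `ThresholdCleaner`, `run_client`) are hypothetical and chosen only for illustration. A real microservice would typically expose this contract over a network protocol rather than a Python call.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CleanRequest:
    """Input contract of the hypothetical cleaning component."""
    values: list[float]
    max_abs: float


@dataclass(frozen=True)
class CleanResponse:
    """Output contract: the surviving values and a count of dropped ones."""
    values: list[float]
    dropped: int


class Cleaner(Protocol):
    """The only thing a caller is allowed to know about the component."""
    def clean(self, request: CleanRequest) -> CleanResponse: ...


class ThresholdCleaner:
    """One possible implementation; callers never see its internals."""

    def __init__(self) -> None:
        self._calls = 0  # private state owned by this component alone

    def clean(self, request: CleanRequest) -> CleanResponse:
        self._calls += 1
        kept = [v for v in request.values if abs(v) <= request.max_abs]
        return CleanResponse(values=kept, dropped=len(request.values) - len(kept))


def run_client(cleaner: Cleaner) -> CleanResponse:
    # The client depends on the contract only, not on ThresholdCleaner.
    return cleaner.clean(CleanRequest(values=[1.0, -2.0, 50.0], max_abs=10.0))


result = run_client(ThresholdCleaner())
```

Because `run_client` is written against `Cleaner`, a different implementation (even one in another technology stack behind a network boundary) could be swapped in without changing the caller.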
2. Advanced analytics perspective
In advanced analytics projects, microservices provide a convenient means of building the analysis pipeline. For example, let us consider a typical business problem where you are trying to run a ‘regression’ analysis to forecast the number of participants in an event. Your data sources may include records of past participation, results from a market survey, and some hard and soft constraints like parking availability, ticket price, and so forth. A typical analysis pipeline for this may comprise data ingestion from heterogeneous sources; data cleaning, such as outlier removal and normalization; joining the heterogeneous data sources; feature extraction/creation; visualization and presentation of the multi-dimensional feature space; creating, cross-validating and saving a prediction model; performance evaluation; and finally, applying the model to the real data. In a microservices architecture, battle-tested independent components with well-defined communication standards are at your disposal for all or several of these steps, facilitating a fast and robust deployment.
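The pipeline described above can be sketched as a chain of small, single-purpose steps. The following is only an illustration under stated assumptions: the function names, the toy records, and the outlier threshold are all made up, each function stands in for what would be a separate service, and a plain least-squares line replaces whatever model a real project would use.

```python
from statistics import mean


def ingest() -> list[tuple[float, float]]:
    # Hypothetical records: (ticket_price, participants). Not real data.
    return [(10.0, 200.0), (20.0, 180.0), (30.0, 160.0),
            (40.0, 140.0), (25.0, 5000.0)]


def clean(rows, max_participants: float = 1000.0):
    # Outlier removal: drop rows with an implausible participant count.
    return [row for row in rows if row[1] <= max_participants]


def fit(rows) -> tuple[float, float]:
    # Ordinary least squares for a single feature: y = intercept + slope * x.
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in rows)
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def predict(model: tuple[float, float], price: float) -> float:
    slope, intercept = model
    return intercept + slope * price


# Each step only consumes the previous step's output.
model = fit(clean(ingest()))
forecast = predict(model, 50.0)
```

Since every step has one input and one output, any of them could be replaced by a call to an independently deployed component without touching the others.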
3. What are other situations where microservices yield high value?
- Efficiency and reliability are even more valuable when you are working in collaboration. Inter-component communication via well-defined APIs mandates consistent data input and output formats across the team.
- Microservices yield high value in recurrent tasks by reducing redundancy. Consider a situation in which you are required to forecast for multiple events, or for the same event every month.
- Another valuable use case is when multiple people or teams are working on different projects that share some functionality. For example, someone else in your company is working on a different ‘classification’ analysis but needs data ingestion, data cleaning, visualization, etc. for their project, while you are working on the ‘regression’. Both projects can simultaneously benefit from the respective microservices, improving the overall efficiency of the company.
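As a rough illustration of the shared-functionality case, the sketch below builds a ‘regression’-style and a ‘classification’-style pipeline from the same two preparation steps. Everything here is hypothetical: `compose` and the step functions are illustrative stand-ins for services, not part of any particular framework.

```python
from functools import reduce
from typing import Any, Callable

Step = Callable[[Any], Any]


def compose(*steps: Step) -> Step:
    """Chain steps so each one receives the previous step's output."""
    def pipeline(data: Any) -> Any:
        return reduce(lambda acc, step: step(acc), steps, data)
    return pipeline


# Shared steps, written once and reused by both projects.
def drop_negative(values: list[float]) -> list[float]:
    return [v for v in values if v >= 0]


def scale_to_unit(values: list[float]) -> list[float]:
    top = max(values) if values else 0.0
    return [v / top for v in values] if top else list(values)


# Project-specific final steps.
def mean_value(values: list[float]) -> float:
    return sum(values) / len(values)


def label_high_low(values: list[float]) -> list[str]:
    return ["high" if v >= 0.75 else "low" for v in values]


regression = compose(drop_negative, scale_to_unit, mean_value)
classification = compose(drop_negative, scale_to_unit, label_high_low)

raw = [-1.0, 2.0, 4.0]
estimate = regression(raw)
labels = classification(raw)
```

A fix or improvement to `drop_negative` or `scale_to_unit` would reach both projects at once, which is the redundancy reduction the list above describes.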
4. What are the drawbacks?
- Each component is built and tested by a specialist or a team of specialists knowledgeable in that area. Once in place, it serves as a black box for its users. The intricacies of the component are no longer exercised in regular deployments. In the long run, this may create a knowledge gap.
- Since inter-component communication is established via APIs that require strictly defined inputs and outputs, the resulting lack of flexibility may be considered a hindrance.
- From an architectural standpoint, in a complex pipeline comprising multiple microservices, the communication overhead may be higher, particularly when each component runs on its own machine in the cloud.
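The communication overhead can be reasoned about with a simple back-of-the-envelope model: in a linear pipeline, total latency is the sum of the compute times plus one network hop between each pair of consecutive components. The numbers below (eight steps, 50 ms each, 20 ms per hop) are purely hypothetical and only show the shape of the calculation.

```python
def pipeline_latency_ms(compute_ms: list[int], hop_overhead_ms: int) -> int:
    """Latency of a linear pipeline: compute time plus one hop
    between each pair of consecutive components."""
    hops = max(len(compute_ms) - 1, 0)
    return sum(compute_ms) + hops * hop_overhead_ms


steps = [50] * 8  # hypothetical per-step compute time in milliseconds

in_process = pipeline_latency_ms(steps, hop_overhead_ms=0)
distributed = pipeline_latency_ms(steps, hop_overhead_ms=20)
overhead = distributed - in_process
```

Under these assumed numbers the distributed pipeline spends 140 ms on communication alone, and that cost grows with every additional component in the chain.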