This case study shows how Substantial, a software consultancy, used Thundra to monitor and troubleshoot its serverless applications and mitigate observability challenges. Thundra helped Substantial keep its data pipelines stable, visible, and healthy.
Substantial is a software consultancy headquartered in Seattle, WA. It combines world-class development, design, and strategy to build products for its many clients, among them Fortune 500 companies, startups, and nonprofits. Besides building products for client projects, the company also builds its own, such as Hello Epics, a Trello Power-Up.
Substantial runs a fully serverless architecture on AWS to keep operational costs low and availability high. Hello Epics is fully serverless as well and is supported by a team of around 40 developers, DevOps engineers, and data and analytics team members who collaborate to design, build, and operate the product.
Hello Epics is a Trello add-on that helps end users track and manage complex projects by creating linked groupings of cards. Since it is a B2B product used by more than 700 teams today, data analysis challenges arise every day.
From the first development phase of the Hello Epics project, challenges arose, including resource and execution limits and monitoring. Substantial attempted to address these issues by using Amazon DynamoDB, AWS Lambda, Amazon S3, Amazon API Gateway, Amazon CloudWatch, Amazon CloudFront, and likely other AWS services, as well as third-party services such as Chargebee and Trello.
Substantial needed deep visibility into its AWS Lambda functions and their environment in order to understand the issues users were experiencing. Data and analytics engineers also needed to monitor performance over time while responding to those issues. In addition, the team was seeking granular cost monitoring that would put it fully in control of its bill.
Designing and maintaining a dedicated information processing environment on an effective data pipeline led to Substantial having a highly distributed serverless architecture. In such a system, a single fine-tuning change could increase costs in ways that were hard to trace.
Since Hello Epics is a B2B product used by many teams across the world, spikes in traffic are very unpredictable. The instant scalability of serverless fits the product’s use cases well. Building the services on serverless makes costs easy to control, and availability is no longer a concern, thanks to AWS.
Substantial’s engineering teams started using AWS technologies a long time ago. The wide range of services meets the serverless development needs of Substantial’s engineers. They use many AWS services and APIs, such as AWS Lambda, Amazon CloudWatch, Amazon S3, Amazon API Gateway, and Elasticsearch, along with external APIs such as Stripe, Trello, and Chargebee.
With all metrics, traces, and logs displayed in a clear and understandable way, Thundra offered numerous benefits. It gave Substantial’s developers the power to search for specific transactions and to understand their performance. Thundra also helped measure latencies added by third-party APIs in transactions.
Figure 1 - Screenshot of the trace chart for an invocation of one of Hello Epics’ AWS Lambda functions.
Thundra automatically discovers the distributed data traces and chain of invocations inside a serverless architecture—including latencies and data flows between functions—and visualizes them in an architectural diagram. This also gives a view of what is happening in the functions method by method and line by line. This combination of distributed and local tracing is called full tracing.
To avoid adding latency overhead to its functions, Substantial took advantage of Thundra’s async monitoring feature. The feature relies on a separate Lambda function, which was quick and easy to set up and sends logs to Thundra without any significant cost increase.
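The general pattern behind this kind of asynchronous monitoring can be sketched in a few lines. The Python example below is an illustration only, not Thundra's implementation: it assumes the monitored functions write their data to Amazon CloudWatch Logs and that a separate Lambda function is subscribed to those logs. The `awslogs.data` envelope (base64-encoded, gzip-compressed JSON with a `logEvents` list) is the standard CloudWatch Logs subscription format; the `send` callable is a hypothetical stand-in for whatever ships the data to the monitoring backend.

```python
import base64
import gzip
import json


def decode_subscription_event(event):
    """Decode a CloudWatch Logs subscription event into a list of log messages."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))
    return [entry["message"] for entry in payload["logEvents"]]


def forward(messages, send):
    """Hand each message to `send` (hypothetical shipping callable) and count them."""
    for message in messages:
        send(message)
    return len(messages)


def handler(event, context, send=print):
    # This function runs outside the request path, so the monitored
    # function only writes a log line and makes no extra network call.
    return forward(decode_subscription_event(event), send)
```

Because the forwarding happens in a different function, any delay in reaching the monitoring backend is not added to the invocation that serves the end user.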
Thundra helped Substantial identify that some of its transactions were timing out because of a misconfiguration of Amazon DynamoDB. A small configuration change in the connection to Amazon DynamoDB eliminated these timeouts completely. Thundra also helps measure the latency that APIs add to transactions, which is particularly important when a system interacts with third-party APIs such as Stripe or Auth0.
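To make the idea of per-API latency concrete, here is a minimal hand-rolled sketch in Python. Thundra collects this kind of data through its own instrumentation; the `timed` context manager, the `latencies_ms` store, and `share_of_total` below are hypothetical names used purely for illustration and are not part of any Thundra API.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Recorded durations in milliseconds, grouped by the name of the external API.
latencies_ms = defaultdict(list)


@contextmanager
def timed(name, clock=time.perf_counter):
    """Record how long the wrapped block takes under the given API name."""
    start = clock()
    try:
        yield
    finally:
        # Record the duration even if the wrapped call raises.
        latencies_ms[name].append((clock() - start) * 1000.0)


def share_of_total(name, total_ms):
    """Fraction of a transaction's total duration spent in the named API."""
    return sum(latencies_ms[name]) / total_ms
```

Wrapping each outbound call, for example `with timed("trello"): ...`, gives a per-API breakdown that shows whether a slow transaction is caused by the function's own code or by a dependency it waits on.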
“I'd consider Thundra a must-have for any serverless application we're building for our clients. The granularity Thundra provides into our distributed microservices is a time and money saver.” -Aaron Jensen, principal developer, Substantial
Thundra has been instrumental in helping Substantial overcome observability obstacles in its serverless applications. It has given Substantial the confidence to take on customer projects in serverless using AWS Lambda functions. The deep visibility into the serverless architecture that Thundra offers has helped uncover bottlenecks in health, latency, and cost. It has also made the design, development, and operation of Substantial’s serverless applications far less of a hassle.
Figure 2 - Screenshot of a subset of Substantial’s serverless architecture
Aaron Jensen continues: "Thundra is now a must-have in our toolkit when building AWS Lambda Serverless applications. It's a clear improvement on the built-in AWS tools and other similar products and gives us the insight that we need to keep things running smoothly for our customers and clients."