Observability, Post-Deployment QA

Monitoring Overall Health


Mar 19, 2024

Most modern teams use services such as Sentry or New Relic to boost their observability, both in production and in lower environments.

With the convergence of things like browser rendering and JavaScript support, testing across different browsers and devices is no longer quite as essential as it once was.

But real users, your customers, might be facing issues that you did not foresee, and without large amounts of effort, time, and resources, you cannot capture everything. Plus, the issue might not be browser-related at all: it could come down to location, a particular operating system, an edge-case user journey, or a different screen type.

You can observe, though. If your application has high enough user throughput, you will quickly be able to map out your Golden Threads, the journeys your users actually take, and ensure that the top 90+% of your users will be unaffected by a change. These may be completely different from how the business imagined the tool would be used.
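To make the Golden Threads idea concrete, here is a minimal sketch of how you might pick out the most common journeys from session data. It assumes each session has already been exported from your observability platform as an ordered list of event names; the `golden_threads` function and the event names are hypothetical, not part of any particular tool.

```python
from collections import Counter
from typing import Iterable, Sequence


def golden_threads(
    sessions: Iterable[Sequence[str]], coverage: float = 0.9
) -> list[tuple[tuple[str, ...], int]]:
    """Return the most common journeys that together cover `coverage` of sessions.

    Each session is an ordered sequence of event names (an assumed export format).
    """
    counts = Counter(tuple(session) for session in sessions)
    total = sum(counts.values())

    selected: list[tuple[tuple[str, ...], int]] = []
    covered = 0
    # Walk journeys from most to least common until enough sessions are covered.
    for journey, seen in counts.most_common():
        if covered / total >= coverage:
            break
        selected.append((journey, seen))
        covered += seen
    return selected


if __name__ == "__main__":
    # Illustrative data only: 100 sessions across three journeys.
    sessions = (
        [("login", "build_flow", "run")] * 50
        + [("login", "view_results")] * 45
        + [("login", "settings")] * 5
    )
    for journey, seen in golden_threads(sessions):
        print(" > ".join(journey), seen)
```

Journeys that fall outside the returned list are the long tail: still worth watching, but not where a regression would hurt most of your users.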

Here at DoesQA, we use observability platforms to monitor everything: runner health (CPU/Memory/Network), all of our event-driven architecture, and most importantly, the web app itself. We can trace exactly how our users interact with our tooling, spot patterns, and draw conclusions about what we should update.

Take this example that came up recently when one of our amazing users raised an issue. The user had a Touch for an Element before a Select node for the same Element. I looked through the data, counted how often this happened, and found that this user was in the <1% who do this (in fact, they were the first). That helped us understand the severity of the request and act accordingly.
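As a rough illustration of that kind of check, the sketch below counts what share of users have a flow where a touch comes before a select on the same element. The `Step` shape, the action names, and the per-user export are all assumptions made for the example; this is not DoesQA's actual data model or query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str   # illustrative action name, e.g. "touch" or "select"
    element: str  # hypothetical element identifier


def has_touch_before_select(steps: list[Step]) -> bool:
    """True if any element is touched earlier in the flow than it is selected."""
    touched: set[str] = set()
    for step in steps:
        if step.action == "touch":
            touched.add(step.element)
        elif step.action == "select" and step.element in touched:
            return True
    return False


def share_of_users_with_pattern(flows_by_user: dict[str, list[list[Step]]]) -> float:
    """Fraction of users with at least one flow containing the pattern."""
    if not flows_by_user:
        return 0.0
    affected = sum(
        1
        for flows in flows_by_user.values()
        if any(has_touch_before_select(flow) for flow in flows)
    )
    return affected / len(flows_by_user)
```

A very small share tells you the issue is a genuine edge case, which helps you decide how urgently it needs a fix.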

In the above example, the user reached out to us, but we were already on the case. Our alerting had notified us of an increase in the test case "started" to "fail" ratio for that account, and we had already started looking into why and planning an update to protect against this edge case. It was a hidden superpower: we could arm ourselves with a clear answer for them without having to ask for steps to reproduce or anything like that.
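An alert like this can be sketched as a comparison between an account's recent failure ratio and its own baseline. The `Window` type, the `should_alert` function, and every threshold below are assumptions chosen for illustration; in practice you would normally configure the equivalent rule inside your observability platform rather than hand-roll it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    started: int  # test cases started in the window
    failed: int   # test cases that ended in failure in the window

    @property
    def failure_ratio(self) -> float:
        return self.failed / self.started if self.started else 0.0


def should_alert(
    baseline: Window,
    recent: Window,
    min_runs: int = 20,    # assumed: ignore windows with too little traffic
    factor: float = 2.0,   # assumed: alert when the ratio at least doubles
    floor: float = 0.05,   # assumed: never alert below a 5% failure ratio
) -> bool:
    """Flag an account whose recent failure ratio has jumped above its baseline."""
    if recent.started < min_runs:
        return False
    threshold = max(baseline.failure_ratio * factor, floor)
    return recent.failure_ratio >= threshold


if __name__ == "__main__":
    baseline = Window(started=1000, failed=20)  # illustrative numbers only
    recent = Window(started=50, failed=10)
    print(should_alert(baseline, recent))
```

Comparing each account against its own baseline, rather than a global number, is what lets an alert like this surface one user's edge case before they report it.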

We have 100% uptime on our production platform, something we are very proud of, and our runners are 100% reliable. A large part of this is down to building in these layers of observability from the start. By leveraging these tools, we can see issues happening lower down without the need for additional testing.

Involving your QA team in these tools will give them superpowers. It lets them focus their energy on making your application stable for your users, and helps them push back on areas your users never go near.

Once you have confidence in these areas, your teams can focus on the polish around the outside. 

We love our observability for many reasons, and you should start if you have not already!
