topicnews · October 24, 2024

5 Tips for Designing Highly Effective Mobile SLOs

Service Level Objectives (SLOs) are a familiar concept to DevOps professionals and Site Reliability Engineers (SREs) as they are critical for monitoring system health and alerting to disruptions. While SLOs have traditionally been the domain of backend engineering, their value is evident when it comes to helping mobile teams ensure high-performance apps and make prioritization decisions between feature and reliability work.

However, for many companies, mobile SLOs are a new and sometimes intimidating endeavor. But they don’t have to be. Any team can effectively adopt SLOs as part of their mobile observability strategy by following some best practices. Here are five tips for designing highly effective mobile SLOs.

1. Think in terms of end-to-end user experiences

For those with a DevOps background, it might be tempting to bring familiar concepts such as endpoint availability and latency directly to mobile. However, when building mobile SLOs, you need to shift your mindset and look at end-user experiences in their entirety. Build your SLOs around an end-to-end flow or activity that you want to optimize, such as a registration or search process, rather than around the individual technical components that make the flow possible, such as screen renders and API calls. Those technical actions are events within your SLO that can be isolated with the right tools when it comes time to troubleshoot an issue, but they should not be the focus.
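
As a rough illustration of a flow-level indicator, here is a minimal Python sketch. It assumes each attempt at a flow has already been stitched together from its underlying events. The `FlowAttempt` record, its field names, and the 5-second threshold are hypothetical choices for this example, not part of any particular tool.

```python
from dataclasses import dataclass, field


@dataclass
class FlowAttempt:
    """One user's attempt at an end-to-end flow (hypothetical record shape)."""
    flow: str            # e.g. "registration" or "search"
    completed: bool      # did the user reach the end of the flow?
    duration_ms: int     # time from first step to last step
    steps: list[str] = field(default_factory=list)  # technical events, kept for troubleshooting


def flow_sli(attempts, flow, max_duration_ms):
    """Share of attempts at `flow` that completed within the time limit.

    Returns None when there were no attempts, so "no data" is not
    mistaken for a perfect or a failing score.
    """
    relevant = [a for a in attempts if a.flow == flow]
    if not relevant:
        return None
    good = sum(1 for a in relevant if a.completed and a.duration_ms <= max_duration_ms)
    return good / len(relevant)


attempts = [
    FlowAttempt("registration", True, 4200, ["render_form", "POST /users"]),
    FlowAttempt("registration", True, 9000, ["render_form", "POST /users"]),   # completed, but too slow
    FlowAttempt("registration", False, 3000, ["render_form"]),                 # abandoned or failed
    FlowAttempt("search", True, 800, ["GET /search", "render_results"]),
]

registration_sli = flow_sli(attempts, "registration", max_duration_ms=5000)
```

The screen renders and API calls still travel with each attempt in `steps`, but the number the SLO tracks is whether the whole registration succeeded quickly enough for the user.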

2. Measure the number of user impacts, not just incidents

Events that take place on mobile devices can have unexpected impacts on your user base that are above or below your expectations. This is because mobile data is largely driven by the concept of individual users and unique sessions, whereas backend data is not. For example, if you notice 1,000 incidents of a certain type of error, how do you know how those incidents are distributed across your users? Did 1,000 unique users experience the error once, or did one unfortunate user experience the error 1,000 times?

If you only measure the number of incidents, it is impossible to know.

As a result, you may be overreacting or underreacting to SLO violations. To truly understand how you should prioritize your response to SLO violations, think about both user counts and event counts.
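
The 1,000-incident example can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes every error event carries a user identifier, and the `impact_summary` helper and the figure of 10,000 active users are made up for the example.

```python
from collections import Counter


def impact_summary(error_user_ids, active_users):
    """Summarize errors by incident count and by affected users.

    `error_user_ids` holds one user ID per error event (assumed input shape).
    """
    per_user = Counter(error_user_ids)
    affected = len(per_user)
    return {
        "incidents": len(error_user_ids),
        "affected_users": affected,
        "affected_share": affected / active_users if active_users else 0.0,
        "max_incidents_per_user": max(per_user.values(), default=0),
    }


# 1,000 different users each hit the error once.
widespread = impact_summary([f"user-{i}" for i in range(1000)], active_users=10_000)

# One unfortunate user hits the error 1,000 times.
concentrated = impact_summary(["user-0"] * 1000, active_users=10_000)
```

Both summaries report the same number of incidents, yet the first affects 10% of the assumed user base and the second affects a single user. Only the user-level fields tell the two situations apart.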

3. Identify the user flows that have the greatest influence

Ultimately, the purpose of SLOs is to prioritize and manage technical work in a way that serves your business. Therefore, when considering which mobile SLOs to create, it is critical to identify the user flows that have the greatest impact on your business, so that everyone clearly understands why a breach of the SLO will force your team to prioritize that issue over other work.

Start with the most direct and obvious indicators of business impact. For example, if your customers cannot successfully pay through the app, this will directly impact your sales. A problem with push notifications, on the other hand, can result in a gradual decline in sales, but it sits far enough back in the sales funnel that a disruption in this functionality shouldn’t force your engineers to drop everything to fix it.

4. Avoid random sampling

One of the big challenges with mobile data is that there is a lot of it. You may be used to sampling the data that is fed into backend SLOs to reduce data processing and storage costs. On the backend this makes sense: after all, you’re dealing with a predictable environment made up of a limited number of device types and other relatively stable variables.

But when it comes to mobile devices, these assumptions don’t apply. There are almost endless variations in device types, operating systems, app versions, network conditions, local infrastructure, etc. This means that sampling the data you feed into SLOs almost guarantees that you will miss important insights.
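
A back-of-the-envelope calculation shows how easily a sample can miss a small cohort. The sketch below assumes each session is kept independently with the same probability, which is a simplification of how real sampling pipelines behave. The cohort size and the 1% rate are hypothetical numbers.

```python
def miss_probability(cohort_sessions: int, sample_rate: float) -> float:
    """Chance that a random sample keeps zero sessions from a cohort.

    Assumes every session is kept independently with probability
    `sample_rate`, so each one is dropped with probability (1 - sample_rate).
    """
    if cohort_sessions < 0:
        raise ValueError("cohort_sessions must be non-negative")
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    return (1.0 - sample_rate) ** cohort_sessions


# Hypothetical cohort: 50 failing sessions on one rare combination of
# device, OS version, and app version, with 1% random sampling.
p_missed = miss_probability(cohort_sessions=50, sample_rate=0.01)
```

Under these assumptions, the sample contains none of the 50 failing sessions roughly 60% of the time, so the problem would more often than not be invisible to an SLO fed by sampled data.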

5. Define the population you really care about

How do you analyze a mountain of high-cardinality data if you don’t sample? And what does this mean in practical terms for your mobile SLOs? This quagmire can largely be solved by focusing closely on the populations you really care about. This is in line with the suggestion we made in Tip 3 about business impact.

Consider which groups among all the users of your app are responsible for the majority of revenue. Depending on your business model, this may mean paying customers rather than free-trial users. Or it may be people using the latest version of your app rather than laggards. Or it may be people living in the specific geographic markets that account for 80% of purchases in your app.

The point is that it is impossible to aim for a perfect experience for all users at all times: you would spend all your time on reliability and none on innovation. However, if you can isolate specific business-critical audiences, you can refine your mobile SLOs and resulting error logs, so you limit disruption to other important technical work when reliability becomes an issue.
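
One way to picture this scoping is to compute the same indicator twice, once over all sessions and once over the population the SLO targets. The Python sketch below is only an illustration: the `Session` fields, the "paid plan on version 5.0 or later" rule, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """One app session with the attributes used for scoping (hypothetical shape)."""
    user_id: str
    plan: str                     # "paid" or "trial"
    app_version: tuple[int, int]  # (major, minor)
    checkout_ok: bool             # did checkout succeed in this session?


def in_target_population(session, min_version=(5, 0)):
    """Example rule: paying customers on a recent app version."""
    return session.plan == "paid" and session.app_version >= min_version


def checkout_sli(sessions, predicate=lambda s: True):
    """Checkout success rate over the sessions matching `predicate`."""
    scoped = [s for s in sessions if predicate(s)]
    if not scoped:
        return None  # no sessions in scope, so there is nothing to measure
    return sum(s.checkout_ok for s in scoped) / len(scoped)


sessions = [
    Session("u1", "paid", (5, 2), True),
    Session("u2", "paid", (5, 0), True),
    Session("u3", "paid", (4, 9), False),   # paying, but on an old version
    Session("u4", "trial", (5, 2), False),  # recent version, but free trial
]

overall = checkout_sli(sessions)
targeted = checkout_sli(sessions, in_target_population)
```

In this toy data the overall success rate is 50%, while the targeted population sits at 100%. An SLO scoped to the targeted population would not interrupt other work here, even though the unscoped number looks alarming.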

Continuously iterate and learn

There’s a lot more that comes into play when crafting your SLO strategy, as every app is unique when it comes to its user base, product goals, and revenue structure. The tips above apply in almost all cases, but you should always consider your individual customer and business needs and plan measurements accordingly.

One of the great things about SLOs is the ability to iterate, especially for mobile devices where there are no “universal standards” or strict expectations yet. Don’t be afraid to keep measuring and iterating as you better understand your app’s performance benchmarks and what levels of error your users are realistically willing to tolerate.

Maintaining strong app performance is a long-term endeavor. Therefore, consider SLOs as a guide.

If you want to delve deeper into mobile SLOs, including more detailed best practices, examples, and templates, download Embrace’s free mobile SLO guide.

