Schedule oncall for recurring conflicts like Sabbath
Most software oncall work requires someone to be available to respond to incidents 24x7x365. Teams split those hours and days in many different ways depending on their circumstances. For example, if team members are located in different time zones, it makes sense for each member to cover oncall mostly during their own daytime, as in the “follow-the-sun” schedule pattern. Another common approach reduces the number of oncall shifts any one person covers by cross-training people across multiple oncall teams, forming one larger oncall team in which each person can handle incidents from any of the sub-teams. The Oncall Scheduler supports flexible shift layouts that work well with many such approaches.
One circumstance that many people find challenging when planning a team’s oncall work is when some oncall engineers have a frequent recurring commitment outside work that makes them unable to be oncall at those times. Some of these cases are well addressed by the Oncall Scheduler’s support for each team member to self-assign shifts and to enter preferences for dates when they don’t want to work.
For example, suppose one oncall engineer has military reserve training leave on the third weekend of each month. If you’re using the Oncall Scheduler, you probably don’t need to do anything special for this when setting up your team’s shift layout. That person can handle it through self-service shift assignment: they spend a few minutes every few months self-assigning a set of future oncall shifts that fit their schedule, so that they never risk being auto-assigned a shift that conflicts with their monthly leave. Alternatively, instead of self-assigning shifts, they could enter preferences to avoid being auto-scheduled on the dates when they’re busy with military training.
Scheduling for Sabbath
Another common case is harder to build a great schedule for: when some members of your oncall team have a weekly commitment that makes them unable to be oncall at certain times every week. The most common such situation I’ve seen is a team where one or more members observe a religious Sabbath day of rest. For example, the Jewish Sabbath runs from Friday sundown until Saturday evening, and people who observe it strictly cannot work during that time. How can you organize your oncall scheduling to accommodate a need like that?
There are several options, depending on other circumstances for the team:
- Option A: without backup. If the rotation you’re scheduling doesn’t need a scheduled backup before or after someone is primary oncall, the simplest approach is a shift layout in which some shifts do not include Sabbath. This might be the case if the services this oncall supports are not critical, so that it’s ok if nobody engages with an incident immediately when the primary oncall misses a phone call or has a personal emergency. A shift layout that could work is: shift 1 runs Sunday 10am - Thursday 10am, and shift 2 runs Thursday 10am - Sunday 10am. The people who cannot work during Sabbath can then take a few minutes every few months to self-assign enough Sunday-Thursday shifts that they’ll never get auto-assigned any other shifts.
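A layout like this can be sanity-checked by computing whether each shift overlaps the Sabbath window. The sketch below is plain Python, not the Oncall Scheduler’s API. The helper names (`hour_of_week`, `covered_hours`) are hypothetical, and Sabbath is approximated as a fixed Friday 18:00 to Saturday 21:00 window, which is an assumption: real sundown times vary by date and location.

```python
DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
HOURS_PER_WEEK = 7 * 24


def hour_of_week(day: str, hour: int) -> int:
    """Hours elapsed since Sunday 00:00 for the given weekday and hour."""
    return DAYS.index(day) * 24 + hour


def covered_hours(start: int, end: int) -> set[int]:
    """Hour-of-week slots in [start, end), wrapping past Saturday midnight."""
    length = (end - start) % HOURS_PER_WEEK or HOURS_PER_WEEK
    return {(start + i) % HOURS_PER_WEEK for i in range(length)}


# Assumption: a fixed approximation of Friday sundown to Saturday evening.
SABBATH = covered_hours(hour_of_week("Fri", 18), hour_of_week("Sat", 21))

# The two-shift layout from the text, with handoffs at 10am.
LAYOUT = {
    "shift 1": covered_hours(hour_of_week("Sun", 10), hour_of_week("Thu", 10)),
    "shift 2": covered_hours(hour_of_week("Thu", 10), hour_of_week("Sun", 10)),
}

for name, hours in LAYOUT.items():
    verdict = "overlaps" if hours & SABBATH else "avoids"
    print(f"{name} {verdict} Sabbath")
```

Under these assumptions, shift 1 avoids the window and shift 2 overlaps it, so Sabbath observers would self-assign shift 1 only.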
Not having a backup oncall is unusual, though. Most rotations need a primary and a backup oncall available at all times. If you do need a backup, the next best option is a setup that keeps shifts reasonably long while still guaranteeing that neither the backup nor the primary responsibility of the engineers who observe Sabbath includes Sabbath time.
- Option B: with backup, and enough members. If there are enough oncall engineers on this team (more than 10), the best approach in Oncall Scheduler is to create two separate rotations, each with about half of the area’s oncall engineers as members. When these rotations are synced into an alerting system such as PagerDuty, both are synced into the same alerting schedule, just as rotations are merged into shared alerting-system schedules for follow-the-sun scenarios.
  - Rotation “Avoid Sabbath” would include all engineers who observe Sabbath, plus enough other engineers to make up about half of the oncall engineers. Its shift layout does not include Sabbath. For example, it could have 1 shift per week, Sunday 10am – Thursday 10am.
  - Rotation “Include Sabbath” would include the remaining half or so of the engineers. Its shift layout includes Sabbath. For example, it could have 1 shift per week, Thursday 10am – Sunday 10am.
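As an illustration of the membership split described above, here is a minimal Python sketch. The function `split_rotations` and the engineer names are hypothetical and are not part of the Oncall Scheduler. It places every Sabbath observer in the “Avoid Sabbath” rotation and tops that rotation up with other engineers until it holds about half the team.

```python
import math


def split_rotations(engineers, observers):
    """Return (avoid_sabbath, include_sabbath) member lists.

    All observers go into the avoid-Sabbath rotation; other engineers are
    added, in list order, until it holds about half of the team.
    """
    unknown = set(observers) - set(engineers)
    if unknown:
        raise ValueError(f"observers not on the team: {sorted(unknown)}")
    avoid = [e for e in engineers if e in observers]
    others = [e for e in engineers if e not in observers]
    target = math.ceil(len(engineers) / 2)
    fill = max(0, target - len(avoid))
    return avoid + others[:fill], others[fill:]


# Hypothetical 12-person team with two Sabbath observers.
team = [f"eng{i:02d}" for i in range(1, 13)]
avoid, include = split_rotations(team, observers={"eng03", "eng08"})
print("Avoid Sabbath:  ", avoid)
print("Include Sabbath:", include)
```

In this sketch the non-observers who fill out the “Avoid Sabbath” rotation are simply the first ones in the list. In practice a team would more likely ask for volunteers.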
If you need backup coverage and you don’t have enough oncall engineers for the area to create two separate rotations (avoid-Sabbath and include-Sabbath), a third option is to have more and shorter shifts. This comes with the drawback of spending (or “wasting”?) a lot of time on oncall handoffs, and of engineers’ lives and work being more diced up by frequent switches between normal everyday work and interrupt-driven oncall work.
- Option C: with short shifts. You could have 1 rotation with a shift layout in which some backup+primary combinations avoid Sabbath. The engineers who observe Sabbath can then spend a few minutes every few months self-assigning such shifts, to avoid getting auto-assigned any shifts that include Sabbath. For example, with all handoffs at 10am:
  - Can observe Sabbath: someone working shift 1: Backup Sun-Wed, Primary Wed-Fri.
  - Cannot observe Sabbath: someone working shift 2: Backup Wed-Fri, Primary Fri-Sun.
  - Cannot observe Sabbath: someone working shift 3: Backup Fri-Sun, Primary Sun-Wed.
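The three backup+primary combinations above can be checked mechanically. The following Python sketch is not Oncall Scheduler code. Its names (`at`, `covered_hours`, `commitments`, `avoids_sabbath`) are hypothetical, it assumes that whoever is backup in one slot is primary in the next, and it approximates Sabbath as Friday 18:00 to Saturday 21:00.

```python
DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
HOURS_PER_WEEK = 7 * 24
HANDOFF_HOUR = 10  # all handoffs at 10am, as in the text


def at(day: str, hour: int = HANDOFF_HOUR) -> int:
    """Hours elapsed since Sunday 00:00."""
    return DAYS.index(day) * 24 + hour


def covered_hours(start: int, end: int) -> set[int]:
    """Hour-of-week slots in [start, end), wrapping past Saturday midnight."""
    length = (end - start) % HOURS_PER_WEEK or HOURS_PER_WEEK
    return {(start + i) % HOURS_PER_WEEK for i in range(length)}


# Assumption: a fixed approximation of Friday sundown to Saturday evening.
SABBATH = covered_hours(at("Fri", 18), at("Sat", 21))

# The three slots per week from the short-shift layout.
SLOTS = [("Sun", "Wed"), ("Wed", "Fri"), ("Fri", "Sun")]


def commitments(shift: int):
    """Backup slot and primary slot for someone working the given shift (1-3)."""
    backup = SLOTS[shift - 1]
    primary = SLOTS[shift % len(SLOTS)]  # primary in the slot right after backup
    return backup, primary


def avoids_sabbath(shift: int) -> bool:
    """True if neither the backup nor the primary slot touches Sabbath."""
    return all(
        not (covered_hours(at(start), at(end)) & SABBATH)
        for start, end in commitments(shift)
    )


result = {shift: avoids_sabbath(shift) for shift in (1, 2, 3)}
print(result)  # {1: True, 2: False, 3: False}
```

Only shift 1 keeps both the backup and the primary slot clear of Sabbath, which matches the list above.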
There is a hypothetical option D, which Oncall Scheduler doesn’t support and which I’m not sure anyone would want to use. Today the Oncall Scheduler supports engineers being on backup duty the shift right before they are primary, the shift right after they are primary, or not having a backup at all. All oncall rotations I have ever seen in the industry follow one of these patterns. But for the Sabbath scenario, if you don’t think any of options A, B, or C will work for you, you could petition kristian@timewesp.com to add a new feature to the Oncall Scheduler: support for engineers to be backup 2 shifts before they are primary, or 2 shifts after they are primary. That would enable a shift layout with 2 shifts per week, in which a Sabbath-observing engineer could work shifts where, for example, they are backup Sun-Thu in week 1 and primary Sun-Thu in week 2.
A drawback of option B and of the hypothetical option D is that there would be a gap of multiple days between when someone is backup and when they are primary. That would reduce the efficiency of both backup and primary work, as engineers would carry less context about current incidents and incident-prone work between their backup and primary oncall times. But I think that’s less problematic than how option C makes the oncall shifts so short that engineers have to context-switch between oncall work and other work more often.
The goal of Oncall Scheduler is to give all oncall engineers self-service control over how they are scheduled, and to make their oncall work fit in with the rest of their work and life. Solving how working oncall fits with observing Sabbath is a great example of that.