A few weeks ago, we discussed how to prepare your culture to move beyond agile. Let’s go a little further this week and examine how other companies have figured out what does and doesn’t work for them when it comes to their development process.
Breaking the Rules of Scrum at Spotify
Spotify, for example, started as a scrum company. But as they grew, the scrum practices got in the way. They began to break the “rules” to fit their company’s growth.
First, scrum masters became agile coaches. Then they changed their scrum teams to squads that own the process from start to finish. Each squad has a long-term mission along with autonomy on how to go about achieving their goals. However, the squad does have boundaries they must adhere to around their mission and product strategy as well as short-term goals.
Spotify has found that giving people more autonomy increases their enjoyment and lets the company respond to change faster. Decisions are handled locally so the squad can respond quickly. There is an overall mission for all teams to move toward. They use the analogy of a jazz band in which the players listen to each other but make their own music.
Cross pollination > standardization
As a result, Spotify doesn’t have formal standards on how they do development. Each team has its own way of doing things. Ideas may pass from team to team via cross-pollination as the squads communicate with each other, but the overall architecture is composed of small services that each team owns from front to back. If one team needs something from another, they can ask whether the other team has time to do the work, or they can edit the code themselves.
Community > structure
Each employee is a member of a squad; many squads make up a tribe. Overall, the corporate groupings are very fluid and can change often. They strive to build a strong community instead of a rigid corporate structure.
Spotify focuses on making it easier to release code and releasing it more often. This is the opposite of the typical waterfall environment where releases are big, seldom happen, and can be painful. They want releases to be common and routine, not rare and dramatic.
Spotify’s release trains
Each application at Spotify has a release train that happens at a regular interval. If a feature is not complete when the train leaves, a feature toggle is used to shut it off. Feature toggles hide unfinished code in testing and production environments. This also helps alleviate issues with merging code, as it is always checked in to one place.
Release trains work on a dependable schedule, similar to a real train, which provides a fixed cadence and predictable planning. Multiple development teams can push features and code into the train. One group is dedicated to the train, and its members are from various backgrounds. Dates and quality are fixed, while scope is the variable in the system. Each team has similar iteration lengths and velocities to support planning.
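To make the “fixed dates, variable scope” idea concrete, here is a minimal sketch in Python. It is not Spotify’s tooling; the `Feature` class, the two helper functions, the feature names, and the two-week interval are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Feature:
    """A hypothetical unit of work riding the train."""
    name: str
    complete: bool


def departure_dates(first: date, interval_days: int, count: int) -> list[date]:
    """Return fixed departure dates; the cadence never shifts for a late feature."""
    return [first + timedelta(days=interval_days * i) for i in range(count)]


def board_train(features: list[Feature]) -> dict[str, bool]:
    """All code ships on the train; only complete features are switched on."""
    return {feature.name: feature.complete for feature in features}


# Illustrative schedule: three departures, two weeks apart.
schedule = departure_dates(date(2024, 1, 8), interval_days=14, count=3)

# The unfinished feature still boards the train, but with its toggle off.
toggles = board_train([Feature("playlists", True), Feature("email_button", False)])
```

The point of the sketch is that nothing in `departure_dates` depends on the features: the dates are computed first, and scope is whatever happens to be complete when the train leaves.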
The release train has a set of roles that manage it, starting with the release train engineer (are we taking the train metaphor too far?), who is similar to a scrum master and keeps the process going. Release management makes decisions on scope and plans releases. Product managers have authority over the content and set the product backlog, working with the product owners on the scrum teams.
How to set up your own release train
Want to try this on your own team? The first step of setting up a release train is to determine the release train domains or who will work on what. On web applications, for example, a natural breakdown could be the frontend and backend services. You would have two teams feeding code into the release train.
After you’ve settled on the features, products, and components for the application, it’s important to determine the iteration length of the teams and the release schedule. Support work for infrastructure, common components, and user interface design will need to operate ahead of the development teams.
Full integration should take place at a regular interval, say, every two weeks. Along with this full integration of the work, demonstrations to stakeholders help to report progress and solicit feedback.
Em Campbell-Pretty, a partner at Context Matters, suggests that teams already familiar with agile will make the transition to release trains more quickly. She describes how they tried this out at one of her initial engagements with a release train: “We decided that the five most mature agile teams would become the EDW Agile Release Train, and we would work to transition the other teams into the train over time.”
How to set up feature toggles
Feature toggles are a way to turn your code or feature on or off depending on the environment or user. For instance, say we want to add an email button to a screen. We could turn this off until we have the feature completely coded and tested. This would be a user feature toggle; a code feature toggle is another variety, which turns code off and on in the underlying application. Martin Fowler and others have written about this topic for a while, and it has gained popularity in many organizations. Feature toggles bring several benefits:
- Less branching. When you check in code with feature toggles, you can turn off incomplete portions. I have dealt with merging issues more than once, and that is never fun.
- Phased rollout. You don’t have to release all the changes at once. I really like this after having to roll back entire releases and hoping we didn’t miss something.
- Safe to fail. This is important to many managers out there. You can simply shut off things that cause problems.
You can further classify your feature toggles into release toggles that the development team can use to push out features. So if you’ve got a release with an incomplete feature, simply have that feature turned off. Once it’s complete, turn it on in the next release. Of course, it’s important to remove the feature toggle once it’s fully released!
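The email button example above can be sketched as a release toggle in a few lines of Python. This is a minimal illustration, not any particular library’s API; the `is_enabled` and `render_toolbar` functions and the `email_button` key are hypothetical names.

```python
def is_enabled(name: str, toggles: dict[str, bool]) -> bool:
    """Unknown toggles default to off, so unfinished code stays hidden."""
    return toggles.get(name, False)


def render_toolbar(toggles: dict[str, bool]) -> list[str]:
    """Build the list of buttons shown on a hypothetical screen."""
    buttons = ["save", "print"]
    # The email button's code is deployed but only shown when toggled on.
    if is_enabled("email_button", toggles):
        buttons.append("email")
    return buttons


# This release ships with the incomplete feature switched off.
this_release = render_toolbar({"email_button": False})

# The next release flips the toggle once the feature is done.
next_release = render_toolbar({"email_button": True})
```

Once the feature is fully released, the `if` check and the toggle entry should both be deleted so that dead toggles don’t pile up in the code.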
Another kind of feature toggle is the business toggle. Perhaps you want to add a new option to your application for your paid subscribing members. Business toggles can be turned off and on to test responses or change marketing.
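A business toggle usually depends on who the user is, not just on whether the code is finished. Here is a rough sketch, assuming a made-up `User` class with a `plan` field and a made-up `offline_mode` feature reserved for paid members.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    plan: str  # hypothetical values: "free" or "paid"


# Hypothetical business toggles: each feature lists the plans that may see it.
BUSINESS_TOGGLES = {
    "offline_mode": {"enabled": True, "plans": {"paid"}},
}


def feature_available(feature: str, user: User, toggles: dict) -> bool:
    """A feature is visible only if it is switched on and the user's plan qualifies."""
    rule = toggles.get(feature)
    if rule is None or not rule["enabled"]:
        return False
    return user.plan in rule["plans"]


paid_sees_it = feature_available("offline_mode", User("ana", "paid"), BUSINESS_TOGGLES)
free_sees_it = feature_available("offline_mode", User("bo", "free"), BUSINESS_TOGGLES)
```

Setting `enabled` to `False` hides the option from everyone, which is the switch the business side would flip when a marketing experiment ends.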
You can set up any of these feature toggles in an application properties file. For instance, in a .NET application, you could set up these toggles in the app.config or web.config. The next time you want to change one of these feature toggles, don’t release new code; just change your app.config. No more rolling back code if there are issues with a feature.
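The same idea works outside .NET. As a rough Python analogue of an app.config, the sketch below reads toggles from an INI-style properties file using the standard library’s `configparser`. The `[features]` section name and the toggle names are assumptions for illustration, and the file contents are inlined as a string so the example is self-contained.

```python
import configparser

# Stand-in for the contents of a hypothetical properties file.
SAMPLE_CONFIG = """
[features]
email_button = off
new_checkout = on
"""


def load_toggles(text: str) -> dict[str, bool]:
    """Parse the [features] section into a name -> on/off mapping."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["features"]
    # getboolean understands on/off, yes/no, true/false, and 1/0.
    return {name: section.getboolean(name) for name in section}


toggles = load_toggles(SAMPLE_CONFIG)
```

In a real application you would call `parser.read()` with the path to the file instead of `read_string`, so flipping a toggle means editing the file rather than shipping new code.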
Leaving room for failure
Spotify’s founder, Daniel Ek, has said, “We aim to make mistakes faster than anyone else.” They want to fail fast to learn fast and improve fast. Spotify tries to create a fail-friendly atmosphere where they put emphasis on failure recovery rather than failure avoidance. By conducting a post-mortem after each failure, they try to capture all lessons learned.
Spotify applications are architected with decoupled components to use a “limited blast radius” method. If something blows up, it doesn’t affect other components. They also use gradual rollouts to allow only a few people to see a new feature. This keeps failure contained, monitored, and measured. Each team performs small experiments constantly and reacts to them.
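One common way to implement a gradual rollout is to hash each user into a stable bucket and enable the feature for a percentage of buckets. The sketch below shows that technique under stated assumptions; it is not Spotify’s actual mechanism, and the function names and the `new_player` feature are hypothetical.

```python
import hashlib


def rollout_bucket(feature: str, user_id: str) -> int:
    """Map a user to a stable bucket from 0 to 99 for a given feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(feature: str, user_id: str, percent: int) -> bool:
    """True if the user falls inside the first `percent` buckets."""
    return rollout_bucket(feature, user_id) < percent


# Start by showing the hypothetical feature to roughly 5% of users.
early_users = [
    user_id
    for user_id in (f"user-{n}" for n in range(1000))
    if in_rollout("new_player", user_id, percent=5)
]
```

Because the bucket depends only on the feature name and user ID, a user who sees the feature at 5% keeps seeing it as the percentage is raised, and dropping the percentage to 0 shuts the feature off for everyone.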
Before a Spotify team builds a new feature, they develop a narrative around the idea. They ask questions such as, “Would anyone want this?” and “Does this provide value?” Using lean startup principles, they build the minimum viable product. They test it by releasing it to a small number of people. By analyzing data and tweaking the prototype, they may eventually roll the feature out completely.
One principle obvious throughout Spotify’s entire development process is that the company seems to value innovation over predictability. They do very little planning to allow for more experiments and innovation. In this way, their squads have more freedom to achieve overall objectives and create new value.
Measuring the Continuous Agile Model at Assembla
Andy Singleton, CEO of Assembla, has written a book called Unblock that details how to use the continuous agile model to develop new products faster. Teams that release faster innovate more often and improve more rapidly.
Like Spotify, Assembla initially used Scrum for its development process. As the company grew, they saw that Scrum didn’t scale effectively for larger teams and rapid releases. They began to break things up and eliminate large releases with a goal of releasing more often. They created a process of continuous agile and non-blocking development.
Using lean principles, Assembla began to focus on one thing at a time. They created an automated environment where nothing was hidden in manual commands; everything was visible to the whole team and in repeatable scripts. This gave people the ability to pull whichever features were ready.
Monitoring, automation, and test layering aid in frequent releases
Assembla constantly measures how its product is used. As a result, they only work on features that are being used rather than spend time on the unused portion. If something isn’t used, it’s removed.
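A minimal sketch of this kind of measurement might simply count usage events per feature and flag anything that falls below a threshold. The function name, the feature names, and the event list below are invented for illustration and are not Assembla’s data or tooling.

```python
from collections import Counter


def unused_features(all_features: list[str], usage_events: list[str],
                    min_uses: int = 1) -> list[str]:
    """Return features used fewer than `min_uses` times, sorted by name."""
    counts = Counter(usage_events)
    return sorted(f for f in all_features if counts[f] < min_uses)


# Hypothetical data: one event per time a feature was used.
features = ["tickets", "wiki", "time_tracking"]
events = ["tickets", "wiki", "tickets", "tickets"]

removal_candidates = unused_features(features, events)
```

Raising `min_uses` turns the same function into a “barely used” report, which can be a useful prompt for a conversation before anything is actually removed.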
Once you have the measurements in place, you can put out frequent releases and collect measurement data to guide your future. The principle of frequent releases was important for Assembla to gain a competitive advantage. Before this change, they were lagging behind their competitors.
Assembla also leveraged automation in the build, test, and deploy cycles to make these frequent releases possible. They ask questions like, “Where can we use machines more?” and “What can we automate?” Andy points out that you can add layers of testing to increase quality in your releases. Alternatively, you can remove layers to speed up the frequency of releases. Unit testing, code review, and human quality assurance are just a few examples he shares of test layers.
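The trade-off between layers and speed can be sketched as a list of checks that run in order. This is only an illustration of the idea, not Assembla’s pipeline; the `run_layers` function is hypothetical, and the lambdas stand in for real test runs.

```python
from typing import Callable

# A layer is a name plus a check that reports whether it passed.
Layer = tuple[str, Callable[[], bool]]


def run_layers(layers: list[Layer]) -> tuple[bool, list[str]]:
    """Run layers in order, stopping at the first failure.

    Returns whether everything passed and the names of the layers that ran.
    """
    ran = []
    for name, check in layers:
        ran.append(name)
        if not check():
            return False, ran
    return True, ran


# More layers: slower, but more chances to catch a problem.
thorough = [
    ("unit tests", lambda: True),
    ("code review", lambda: True),
    ("human QA", lambda: False),  # stand-in for a QA pass that finds a bug
]

# Fewer layers: faster releases, with less checking before they go out.
fast = thorough[:2]

thorough_result = run_layers(thorough)
fast_result = run_layers(fast)
```

In this made-up example the fast pipeline passes while the thorough one catches a problem, which is exactly the risk you accept when you remove a layer to release more often.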
How to set up test layering
Test layering can start with unit tests using mock objects. These unit tests can be run from a developer’s machine before they check in code to the repository. This can be automated by setting up a continuous integration machine that runs all tests after any code is checked in.
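Here is what a first-layer unit test with a mock object might look like, using Python’s standard `unittest` and `unittest.mock` modules. The `send_welcome` function and the mailer’s `send` method are hypothetical; the mock stands in for a real mail service so the test runs on a developer’s machine with no network.

```python
import unittest
from unittest.mock import Mock


def send_welcome(mailer, address: str) -> bool:
    """Hypothetical code under test: send a welcome mail to a valid address."""
    if "@" not in address:
        return False
    mailer.send(to=address, subject="Welcome")
    return True


class SendWelcomeTest(unittest.TestCase):
    def test_sends_mail_to_valid_address(self):
        mailer = Mock()  # replaces the real mail service
        self.assertTrue(send_welcome(mailer, "a@example.com"))
        mailer.send.assert_called_once_with(to="a@example.com", subject="Welcome")

    def test_rejects_invalid_address(self):
        mailer = Mock()
        self.assertFalse(send_welcome(mailer, "not-an-address"))
        mailer.send.assert_not_called()
```

Saved in a file, these tests can be run with `python -m unittest` before checking in, and a continuous integration machine can run the same command after every check-in.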
Add another layer to this by running integration tests on a virtual machine that simulates the production environment. Make sure you have a VM with a database similar to production or whichever environment you’re trying to test; this layer helps test any assumptions that developers have made in coding their solution.
Similar to the last layer, you’ll want to have integration tests plus a system test that will run in a shared environment, usually in a continuous integration system. This layer tests recent code changes that could prevent the team from moving forward. It can also check source control issues, merge issues, or batching changes.
Your second-to-last layer should be a mixture of load tests, disaster recovery tests, and performance tests that can be performed in a staging or pre-production environment. Ensure that this environment is as close to production as possible. These tests should be run regularly at a frequency that the team can determine.
The last layer of tests should focus on the production system. These tests check the system for partial outages and monitor how they are handled. This layer is a style of testing that the development team at Netflix has gained some notoriety for: the Chaos Monkey.
Separating release from launch
Similar to Spotify’s feature toggles, Assembla uses feature switches that give them the ability to hide functionality until it’s ready to reveal.
Andy talks about separating release from launch. As you work on a new feature, it’s hidden in the numerous releases that come about over time. As a result, the delivery process changes from one big batch release to continuous and frequent releases of small changes. Enhancements are put out and measured to see what value they bring, then changed or removed depending on the metrics.
It’s interesting to note that, in the end, Assembla developers are responsible for quality. They decide when a feature is ready for release. This puts the onus on the developer to produce quality code and gets them more engaged in code reviews and testing; the quality assurance staff becomes more of a consultant that aids in the testing strategy.
Beyond Agile: In Conclusion
If we examine these different approaches from Spotify and Assembla, we can begin to see the variety of methods available to us for managing the challenges that technology teams face as we try to move beyond agile development.
In the future, many of these “beyond agile” practices will become commonplace, but remember that the first step for building devops in your company is getting all of the roles on a team working together. Collaboration is a big change from the old “throw it over the wall” approach. Working closely with the business along with all technical staff to understand priorities and deliver value is a key part of devops.