Everything That Can Go Wrong Will – Use Feature Flags to Manage the Risks
Edward A. Murphy Jr. was an American aerospace engineer who worked on safety-critical systems for the United States Air Force. You might wonder how this is relevant. You see, even though you may never have sat in a fighter jet and were probably still learning to tie your shoelaces when he died in 1990, you have either heard of or used his most famous invention: Murphy’s Law.
Murphy’s Law and product development
Product companies that rely on keeping their customers engaged understand that the speed of deploying new system capabilities is important. Their development teams are furiously pounding away on their keyboards, crafting code that they hope will wow the customer, ensuring their work is properly merged in source control and seamlessly delivered for the world to use, enjoy, or hate (depending on which side of the bed the user woke up on that day).
Feature after feature is added to the main branch, and an automated deployment agent is standing by to lift the code libraries and carefully place them in a production server. If configured properly, the entire end-to-end process is like a symphony, never missing a beat and flowing from idea to value seamlessly.
Enter Eddy and his ominous warning. It turns out a new feature that was added at the last minute and not thoroughly tested has an edge-case defect that is very unlikely to occur. If it does get invoked, however, it can lead to a major security flaw that compromises users’ sensitive data and ruins the company’s reputation.
Don’t call your lawyers yet; the development team uses feature flags
Feature flags, also known as feature toggles, allow safer delivery of features by decoupling deployment from feature release. Think of them as “kill switches” that can be used to dynamically reconfigure a production system if necessary. Feature flags support continuous integration/continuous delivery (CI/CD) and foster a culture of experimentation that keeps adding value to the customer’s engagement.
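As a concrete illustration, here is a minimal sketch of a flag-guarded code path with a runtime kill switch. The flag name, the in-memory store, and the checkout function are all invented for this example; production systems typically back flags with a configuration service or database.

```python
# A minimal sketch of a feature flag acting as a "kill switch".
# FLAGS, set_flag, and checkout are hypothetical names for illustration.

FLAGS = {"new_checkout_flow": True}  # toggled at runtime by operators

def set_flag(name: str, enabled: bool) -> None:
    """Flip a flag without redeploying any code."""
    FLAGS[name] = enabled

def checkout(cart: list) -> str:
    """The deployed code carries both paths; the flag picks one."""
    if FLAGS.get("new_checkout_flow", False):
        return f"new flow: {len(cart)} items"
    return f"legacy flow: {len(cart)} items"

print(checkout(["book"]))             # new flow: 1 items
set_flag("new_checkout_flow", False)  # kill switch: disable instantly
print(checkout(["book"]))             # legacy flow: 1 items
```

The key property is that both code paths are already deployed; the flag decides which one runs, so the release can be reversed without a redeploy.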
Use feature flags in a repeatable way to ensure predictable outcomes
This analyst always found irony in the DevOps mantra of making processes predictable in order to figure out how to be innovative. In other words, use standardization to discover nonstandard ideas. That was, of course, before I realized how feature flags could support DevOps-inspired delivery pipelines.
In the spirit of standardizing the use of feature flags in product delivery, the following are some best practices to follow when using them:
1. Turn flags on or off on the server to avoid cache invalidation challenges.
Modern web applications, such as SPAs, are feature rich. The desire to give users an incredible experience that will blow their minds has spawned a whole host of techniques (like caching) in which the browser effectively becomes the computing agent (and with technologies like WebAssembly, this trend is only going to become more pervasive). However, just because client-side processing has become faster does not mean it’s a good idea to manage flag toggling there.
For one, client-side caching (a widely used technique for improving application performance) interferes with keeping a flag’s status synchronized between the server and the client. If the client manages the flag’s setting and the kill switch for the feature is flipped on the server, what guarantee do you have that the client is connected to the server to pick up the change? Is there an automatic push mechanism from the server to the client?
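One way to sidestep the problem is to resolve flags on the server per request, so the client only ever receives the resolved outcome and has no flag state to cache. A minimal sketch, with invented flag, store, and handler names:

```python
# Sketch: resolve flags on the server per request. The flag, the FLAGS
# store, and render_shipping_options are hypothetical names.

FLAGS = {"free_priority_shipping": True}

def render_shipping_options(user_id: str) -> dict:
    """Server-side handler: reads the flag on every request, so a
    flipped kill switch takes effect immediately."""
    options = ["standard"]
    if FLAGS.get("free_priority_shipping", False):
        options.append("priority (free)")
    # Only the resolved result crosses the wire; there is no
    # client-side flag state for a cache to hold stale.
    return {"user": user_id, "shipping_options": options}

print(render_shipping_options("u42"))
FLAGS["free_priority_shipping"] = False  # server-side kill switch
print(render_shipping_options("u42"))
```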
2. Feature flags managed on the server exclusively reduce implementation complexity.
Yes, rich clients are more powerful now, with all the processing strength available on our local machines. Nevertheless, there will always be situations where both server and client manage feature-flag state for different flags, and the fragmentation of processing across two domains can get really complex in a hurry.
3. Make feature-flagging decisions close to the point of first contact with the user, especially in distributed microservice architectures built on a domain-driven design model.
Microservices have gained tremendous popularity for creating modular services that each manage one and only one business domain. In an intelligently designed retail web service, one would expect a microservice for customers, one for checkout, one for ordering, and so on (with the caveat that sometimes it’s not as easy as alluded to here).
If the company wants to experiment with a feature flag for free priority shipping for a certain type of customer, it makes the most sense to toggle that flag in the customer domain. Realistically speaking, the flag could be manipulated in the checkout or ordering service, but that breaks the rule of domain independence. It is always better to make the feature-flag decision close to the customer domain, because its outcome is most valuable there.
Making feature-flag decisions close to the business logic controlling the flag’s toggle makes it unnecessary to share the user’s context with other code modules, ensuring the principles of modular code are maintained. This practice is not easily implemented, however, and requires strong architectural discipline.
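A sketch of what that separation might look like: the customer-facing service evaluates the flag against the user’s context once and passes only the decision downstream. All service, flag, and field names here are hypothetical.

```python
# Sketch: the flag decision is made at the point of first contact and
# only the outcome travels downstream. Names are invented for this example.

FLAGS = {"free_priority_shipping": True}

def customer_service(user: dict) -> dict:
    """Point of first contact: evaluates the flag against user context."""
    eligible = (FLAGS.get("free_priority_shipping", False)
                and user.get("tier") == "gold")
    return {"user_id": user["id"], "free_priority_shipping": eligible}

def checkout_service(order_ctx: dict, total: float) -> float:
    """Downstream domain: acts on the decision, knows nothing about
    the flag or the user's profile."""
    shipping = 0.0 if order_ctx["free_priority_shipping"] else 9.99
    return total + shipping

ctx = customer_service({"id": "u42", "tier": "gold"})
print(checkout_service(ctx, 50.0))  # 50.0 (free shipping applied)
```

Note that `checkout_service` never sees the user’s tier or the flag itself, which keeps the domains independent.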
4. Don’t forget to make the code changes that support feature flags testable.
Code testing for feature flags can span the macro and/or the micro. The macro tests are black-box tests and should only be concerned with ensuring the expected user experience when the feature flag is turned on.
For those interested in auditing the intermediate steps that follow the activation of a feature flag, process-step-level testing should be done, at a minimum, for both states of the flag (i.e., on and off). Many times has this analyst come across situations where feature flags were tested only in the “on” state, on the assumption that the “off” state was already covered by the core regression or functional test cycles.
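A minimal sketch of testing both states explicitly, using an invented flag-controlled feature:

```python
# Sketch: exercise a flag-controlled feature in BOTH states, not just
# "on". The banner function is a hypothetical feature under test.

def shipping_banner(flag_on: bool) -> str:
    """Feature under test: output depends on the flag's state."""
    if flag_on:
        return "Free priority shipping!"
    return "Standard shipping rates apply."

def test_flag_on():
    assert shipping_banner(True) == "Free priority shipping!"

def test_flag_off():
    # The easy one to forget: verify the OFF path explicitly instead
    # of assuming the core regression suite covers it.
    assert shipping_banner(False) == "Standard shipping rates apply."

# A runner such as pytest would discover the test_* functions;
# here we simply call both states explicitly.
test_flag_on()
test_flag_off()
```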
5. Database changes must be done systematically and with extreme caution.
Code changes to production systems often require a change to the data model. The schema needs to support any newly deployed code, and sometimes that means applying a migration to the database schema.
The migration can follow one of two approaches: Expand-Contract or Parallel Updates.
With Expand-Contract, the safe option is to update the data model (Expand) without putting any referential constraints on newly added tables. Once the Expand is done, use code changes to write to the new and old data models simultaneously until the switchover to the new tables is complete. At that point, the old data model is retired, and the data model Contracts back to a smaller size.
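The dual-write phase of Expand-Contract can be sketched as follows, using SQLite from the standard library; the table and column names are illustrative only.

```python
# Sketch of the Expand-Contract dual-write phase: while old and new
# schemas coexist, the application writes to both until switchover is
# complete. Table and column names are invented for this example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_old (id INTEGER, address TEXT)")
# Expand: the new table is added without referential constraints.
conn.execute("CREATE TABLE orders_new (id INTEGER, street TEXT, city TEXT)")

def save_order(order_id: int, street: str, city: str) -> None:
    """Dual write: keep the old and new models in sync until cutover."""
    conn.execute("INSERT INTO orders_old VALUES (?, ?)",
                 (order_id, f"{street}, {city}"))
    conn.execute("INSERT INTO orders_new VALUES (?, ?, ?)",
                 (order_id, street, city))
    conn.commit()

save_order(1, "12 Elm St", "Toronto")
# Contract (later): once all reads use orders_new, drop orders_old.
```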
Parallel Updates require both code and data-model changes to hit production at the same time. Unless thoroughly tested, this is a risky mechanism that can lead to severe outcomes.
6. Remember to clean up flags that are no longer relevant.
Your A/B test was a success. The customers loved the priority shipping option for any order volume over Can$50. The product team is convinced this is a feature worth rolling out, and the enhancement is made public on the weekend. The flag served its purpose, and now it needs a graceful retirement.
Make flag retirement a project task that needs to be completed for project closure.
The principles of Lean and Kanban demand that every bit of work that needs to be done has visibility. If you don’t see it, you won’t know it has to be done. The same applies to retiring feature flags: make it part of the project closure tasks to make sure it is done.
7. A feature flag by any name is a confusing flag.
Like all variables we write and give unique names to, feature flags deserve descriptive labels. Instead of calling a flag “priority_shop,” it is better to label it “priority_shop_front_UI” or “priority_shop_front_DB,” etc., so the flag’s scope and purpose are unambiguous.
8. Use server jobs to retire flags.
Use server-side jobs, such as cron jobs (or similar scheduling mechanisms), to periodically check a flag’s best-by date and toggle it off when the retirement conditions are satisfied.
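A sketch of such a retirement job, with hypothetical flag records carrying a best-by date:

```python
# Sketch of a server-side retirement job: each flag carries a best-by
# date, and a scheduled task (cron, etc.) turns off any flag past it.
# The flag records and dates are hypothetical.
from datetime import date

FLAGS = {
    "free_priority_shipping": {"enabled": True, "best_by": date(2020, 6, 1)},
    "new_checkout_flow":      {"enabled": True, "best_by": date(2099, 1, 1)},
}

def retire_expired_flags(today: date) -> list:
    """Run periodically; returns the names of flags it toggled off."""
    retired = []
    for name, record in FLAGS.items():
        if record["enabled"] and today > record["best_by"]:
            record["enabled"] = False
            retired.append(name)
    return retired

print(retire_expired_flags(date(2020, 7, 1)))  # ['free_priority_shipping']
```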
9. Integrate the outcomes from feature flags into a feedback loop for measuring impact on users and systems.
Using feature flags, businesses make managed changes to a system in the hope of observing the impact and improving the business. When assessing the impact of a new feature, it’s important to look not only at increases in revenue-generating metrics but also at less obvious indicators like system reliability, scalability, and operational load.
Feature flags help product delivery teams by reducing the risk that feature releases pose to code deployments. They also provide a mechanism for feedback and iteration by linking features to changes in engineering KPIs and product metrics.
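As a simple sketch of closing the loop, compare a KPI between the flag-on and flag-off cohorts. The cohort data below is invented purely for illustration.

```python
# Sketch: compare a KPI between flag cohorts to feed the feedback
# loop. The cohort data is fabricated for illustration only.

def conversion_rate(events: list) -> float:
    """events: (user_id, converted) tuples for one cohort."""
    return sum(1 for _, converted in events if converted) / len(events)

flag_on  = [("u1", True), ("u2", True), ("u3", False), ("u4", True)]
flag_off = [("u5", True), ("u6", False), ("u7", False), ("u8", False)]

lift = conversion_rate(flag_on) - conversion_rate(flag_off)
print(f"conversion lift: {lift:+.2f}")  # conversion lift: +0.50
```

The same comparison can be run against reliability or load metrics, not just revenue-related ones.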
We all claim to be learning organizations that love to experiment. Using feature flags to assist with the continual evolution of the business is a tested approach for many of the world’s tech giants. We might not be tech giants, but who’s to say we can’t behave like one?