Twitter thrives on sharing, not only within the social media platform but also through links from all over the internet. On Monday, though, most of those links stopped working.
For about an hour, anyone trying to share recently published articles on Twitter got an error message that was clearly aimed at developers.
It was almost as if Twitter were telling publishers that they hadn’t paid their water bill and so couldn’t post links on the social network.
What went wrong?
We didn’t have to wait long for Twitter CEO Elon Musk to explain. In response to a tweet from Netscape co-founder and well-known venture capitalist Marc Andreessen, who pointed out that four of the top five Twitter trends were about Twitter, Musk tweeted on March 6, 2023, “A small API change had huge impact. The code stack is extremely fragile for no good reason. Will eventually have to be completely rewritten.”
However, this seemingly clear-eyed tweet should be cause for alarm. Musk claims that the code stack (actually a huge pile of programs that all work together to create the Twitter whole) is fragile and needs to be rewritten. What he doesn’t mention is that among the thousands of Twitter employees he has laid off since November, a good number were engineers and, it’s safe to assume, some were in what’s known as QA, or quality assurance.
If you plan to make any kind of code change to a website, online service, or app, QA typically tests it on an offline copy of the platform. In this way they ensure that the updates, no matter how small, do not have a negative impact on the live environment.
The concept is known as “production,” the live site or service, versus “staging,” an environment that is identical to live but cannot be seen or touched by users. You run your new code or feature through staging, a group of QA testers applies a set of known scenarios (and maybe throws in a few edge cases), and as long as there are no red flags, the update is pushed from staging to production.
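To make the staging-to-production idea concrete, here is a minimal sketch in Python. It is not Twitter's pipeline, which I have no visibility into. Every name in it (`Scenario`, `run_scenarios`, `promote`, the `share_link` endpoint and the two handlers) is hypothetical, and an "environment" is modeled as nothing more than a dictionary of endpoint handlers. The point is only the gate: a release reaches production if, and only if, every known scenario passes against staging first.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical model: an environment maps an endpoint name to a handler.
Handler = Callable[[str], dict]
Environment = Dict[str, Handler]


@dataclass
class Scenario:
    """One known QA scenario: call an endpoint and expect a status code."""
    name: str
    endpoint: str
    payload: str
    expected_status: int


def run_scenarios(env: Environment, scenarios: List[Scenario]) -> List[str]:
    """Run every scenario against env and return the names of the failures."""
    failures = []
    for scenario in scenarios:
        handler = env.get(scenario.endpoint)
        if handler is None:
            # A missing endpoint counts as a failed scenario.
            failures.append(scenario.name)
            continue
        response = handler(scenario.payload)
        if response.get("status") != scenario.expected_status:
            failures.append(scenario.name)
    return failures


def promote(staging: Environment, production: Environment,
            scenarios: List[Scenario]) -> bool:
    """Copy staging into production only if no scenario raises a red flag."""
    if run_scenarios(staging, scenarios):
        return False  # production is left exactly as it was
    production.clear()
    production.update(staging)
    return True


def share_link_ok(url: str) -> dict:
    """Hypothetical healthy handler: sharing a link succeeds."""
    return {"status": 200, "card": url}


def share_link_broken(url: str) -> dict:
    """Hypothetical 'small API change' that makes link sharing fail."""
    return {"status": 403, "error": "endpoint not available"}


SCENARIOS = [
    Scenario("share a fresh article link", "share_link",
             "https://example.com/story", 200),
    Scenario("share a link with a query string", "share_link",
             "https://example.com/story?ref=home", 200),
]

production: Environment = {"share_link": share_link_ok}

# The broken release is stopped in staging and never reaches users.
bad_release: Environment = {"share_link": share_link_broken}
blocked = promote(bad_release, production, SCENARIOS)

# A release that passes every scenario is promoted.
good_release: Environment = {"share_link": share_link_ok}
shipped = promote(good_release, production, SCENARIOS)
```

In this toy version the broken handler is caught by both scenarios, so `promote` refuses the release and the live dictionary is never touched. A real pipeline would run far more scenarios against a full copy of the platform, but the shape of the check is the same.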
Twitter, which has seen its overall reliability drop (from going offline to unexpected feature appearances and disappearances) since Musk took over, may be getting its updates in a different way.
Musk likes to test features on production (the live site). As a result, he keeps running into unintended consequences.
There is some disagreement over whether or not a Twitter QA team exists.
Some claim one exists, but Musk gets impatient and then pushes untested code live.
Others insist that Elon Musk arrived at Twitter only to find that the company had no QA team and had long been pushing untested code live. However, that seems highly unlikely.
I asked Musk directly on Twitter whether the API update had been tested in staging before being pushed live, and I will update this post if he responds.
Never assume
The assumption he made here, that a small change to the API would have little impact on the site, was a bad one. And yet Musk still doesn’t understand that he’s doing it wrong.
Testing all sorts of features on a live version of a complex platform like Twitter will inevitably lead to bugs and crashes.
Will rewriting the code stack fix all of this? Perhaps, but very few platforms stay as clean as they were at launch, and even if the rewrite is robust and perfect, frequent updates and new features will test that stability.
As long as Musk refuses to fully test what he launches before launching it, there’s no scenario where Twitter escapes regular downtime.
There is a simple solution, Elon: make QA an inescapable part of the development pipeline and save yourself and us a lot of headaches. Or keep doing it your way, because it works so, so well.