A big rewrite

December 18, 2018 6-minute read

softwaredevelopment • rewrites

Folklore and common sense warn developers and teams against doing big rewrites.

To do or not to do

There are many reasons not to rewrite apps from scratch:

rewrites take time
the “legacy” app still needs to be supported and probably debugged
rewriting the same exact app hoping to change the outcome could be an early sign of madness
requirements will definitely change from when you start to when you finish
the company already paid for the “legacy” app, now you want it to pay for the same thing twice
management will probably be hard to convince

If the rewrite is justified though, there are some positive aspects:

the time spent rewriting is time in which you’re going to learn a lot
the “legacy” app has become a garbage fire (because of turnover, feature creep, bad design, lack of expertise, Saturn in opposition, whatever) and is slowing growth
even if you think so, you won’t end up rewriting the exact same thing
the changes in requirements might result in a different, better, product
if you’re not a startup, the “legacy” app is usually funding the rewrite anyway
management and your colleagues will trust you a lot in future years if you all manage to pull this off
you get rid of all the tech debt just by deleting a folder (and you get to create brand new debt :-D, but let’s not be picky)

Why I’m writing this

The other day I read two “old” posts about a successful “big rewrite”.

In the first one, Against the Grain: How We Built the Next Generation Online Travel Agency using Amazon, Clojure, and a Comically Small Team, Colin Steele narrates a journey of moving from a giant pile of tech debt that was going to sink the company to a successful re-engineered product. In the fray there are mistakes made and… a successful acquisition by another company.

The product is a hotel meta search engine.

The bulk of their story

Premise

they initially had the wrong business model (quite common with startups, at least in my experience)
the app was a spaghetti of monolithic PHP probably worked on by many hands
the database was a mess
there were no tests
he was hired as a consultant and extracted a key feature using Ruby and async programming but the rest was too far gone in his opinion

Pre-execution

he became CTO of the company and convinced management to attempt a rewrite
they fired most of the existing devs and hired just a handful of seniors (another common theme in startups in damage control that are draining money)
they started the rewrite while keeping the old product running
they switched from hosted servers to cloud (keep in mind that this happened in 2010) which took convincing

Tech choices

the frontend dev wrote a SPA with vanilla JS (again, in 2010)
after thorough testing and some guts they settled on Clojure (even though he was a Ruby expert). Ruby was abandoned because it required more resources to scale, which they didn’t have, and because of its built-in concurrency model
Clojure was the right choice for them. As he wrote: as the CTO at a cash-strapped startup, Clojure was the answer to a prayer.
Clojure was probably an easier sell than usual because of how tight they were on time and resources and how much management trusted the team (it would be a tough sell in 2018, imagine in 2010)
the type of web app they had and the performance testing they performed justified the choice (and saved the company money)

Post execution

they were acquired at the end of 2011
all of the tech choices they made were questioned (why AWS, why Clojure)
he says that they were able to “sell” the choice of Clojure to the new company because it sits on top of the JVM and because of the nice graphs about the performance of the system he showed them

End of the story

From the second post, 60,000% growth in 7 months using Clojure and AWS:

Over the course of the last 7 months (we launched in January 2012), we’ve gone from about 1,000 uniques/day on hotelicopter’s site, to 600,000+/day on roomkey.com. That’s 60,000% growth in 7 months

So, the rewrite paid off.

Another thing to notice is the amount of trust management gave him and the team. Without that, the rewrite would probably have failed, or they would have run out of money, or they would have had to refactor incrementally, maybe taking more time. We’ll never know.

If you want to read more about the tech choices and the stack read this second post.

An anecdote from a solo rewrite I did

I once was hired to work on an unmaintainable app that had to be rewritten.

Coincidentally, it was written in PHP as well, and it too had a database structure that needed Sherlock Holmes to decipher. It took me at least a week of staring at MySQL tables with cryptic names and cryptic fields and googling PHP functions to figure out what happened to the data (most of the DB logic was in the app) and to design a new, sane DB.

I ended up rewriting the app in Python in a short time (and migrated the data). It worked :D

The scope was smaller though, and I had no choices to justify: they needed someone with expertise to bring a legacy app to a known stack and then hand it over.

The good thing about this rewrite is that knowledge of the previous stack ultimately wasn’t required, and I was happy to mostly ignore the app code and focus on the data to bring along and the requirements.

Now that I’m writing this, I think the “legacy app” could also be used as an argument in favor of frameworks for less experienced developers working in small companies where they might not have seniors to interact with day to day. But I digress.

Conclusions

Keep in mind that there’s no single way to accomplish a rewrite: you might pull it off with a mixture of refactoring and rewriting. For more on this I defer to Blaine Osepchuk’s The Rewrite vs Refactor Debate: 8 Things You Need to Know.

If you want to read another success story (seemingly less wild in its premises), read I ran a ludicrously complex engineering project (and survived).