Back to "Working on Legacy Codebases (or a contractor's life)"

This is a viewer only at the moment see the article on how this works.

To update the preview hit Ctrl-Alt-R (or ⌘-Alt-R on Mac) or Enter to refresh. The Save icon lets you save the markdown file to disk

This is a preview from the server running through my markdig pipeline

Legacy Code

Working on Legacy Codebases (or a contractor's life)

Wednesday, 06 November 2024

Introduction

As a freelance developer one of the skill-sets you need to learn quickly is how to work on existing codebases effectively. I've been lucky to have built a bunch of from--scratch systems; this is a JOY as an experienced developer however it's not always the case.

The Challenges

Legacy systems have significant challenges; especially when the key developers / architects (if you're lucky enough to have them) have moved on.

Documentation

This is often overlooked especially in smaller companies. In general, you need 4 key types of documentation:

  1. New Developer guides so how do you run the project locally (more on this in a minute), what considerations are there (if for example you need a different framework than current, especially true with Node sadly).
  2. Deployment Docs in these days when CI/CD is seen as the gold-standard of deployment it's often overlooked that you need to know how to deploy the system manually. This is especially true if you're working on a system that's been around for a while and has a lot of manual steps.
  3. Problem Resolution Procedures; what do you do when the system goes down? Who do you call? What are the key things to check? This is often overlooked but is CRUCIAL in a production system.
  4. Architectural / System-Design Docs; how does the system work? What are the key components? What are the key dependencies? Think of this as your roadmap when learning a new system. It provides a high-level overview of how the system works which can be critical (especially in larger, distributed systems). This SHOULD include which source code repositories are used, which CI/CD systems are used, which databases are used for each element of the application. This can encompass anything from a simple diagram to a full-blown UML diagram of each component of the system. For example in a React Application this would typically include a diagram of the components and how they interact with each other and any backend services. In a .NET Core application this would include a diagram of the services and how they interact with each other (and the front end & other backend systems) and possibly ASP.NET Views and what services they may use.

Documentation is one of the things we as developers often hate doing (it's NOT code) but it's crucial. As you can see I LOVE Markdown, with the likes of Mermaid and PlantUML you can create some really nice diagrams and flowcharts that can be included in your documentation.

It needs to be CURRENT; developers often talk about bit-rot; where a project's dependencies become obsolete / downright security risks. This is also true of documentation; if it's not current it's not useful.

There's varying levels of documentation but in general any time you change code which is referred to in a document you should update that document.

Running the system locally

This is often a challenge; especially if the system has been around for a while. You need to know what version of Node / .NET etc., what version of the database, what version of the framework etc. This is often overlooked but is CRUCIAL to getting up and running quickly.

I've often seen developers saying that this isn't relevant in these days of cloud systems and large distributed applications, but I disagree; you need to be able to run the system locally to debug issues quickly and effectively.

  1. Profiling - this seems to have become an 'advanced developer only' skill, but it's crucial to be able to profile your code to see where the bottlenecks are. This is especially true in a legacy system where you may not have the luxury of being able to rewrite the system from scratch. Tools like dotTrace and dotMemoryare invaluable here. Especially in ASP.NET applications; the biggest issues are generally 'leaks' in the system; memory leaks, connection leaks etc. While your app will seem to run FINE for a bit you suddenly find it's crashing / using up all the memory on the server.
  2. Finding issues in the debugger - Like it or not the days of Response.Write in Classic ASP are over. For an even modestly complex application (especially ones you didn't write) it's crucial to 'step through' code. To follow a request from its inception through the service to identify what MIGHT be happening, what exceptions may not be caught etc.
  3. Logging - Again often overlooked (sorry but no an ASP.NET Core exception filter at the top of the stack isn't enough). Referring back to my previous post, logging is a critical part on working in code LOCALLY. Remember you can use customer logging types (for instance using the Logging Source Generators) in ASP.NET Core. You can then specify which event types should be logged and which shouldn't. This is crucial in a legacy system where you may not have the luxury of being able to rewrite the system from scratch. For Application Insights you may want User Journeys and other features; however be aware this becomes expensive fast if using LogInformation for every request. You may want to use a custom Telemetry Processor to filter out requests you don't want to log.
  4. Testing - Too often I see developers 'testing in preprod' or thinking that Unit Test then check in is ENOUGH in terms of testing. Its RARELY IS, there's a lot of value in both using a tool like Resharper / NCrunch to automatically run unit tests; however remember Unit Tests are just that; they test a 'unit' of code. You need to retain visibility in how the system actually runs in concert with other components. This is where Integration Tests come in; they test how the system runs as a whole. You can use a tool like SpecFlow to write these tests in a human-readable format. Again; crucial in a legacy system where you may not have the luxury of being able to rewrite the system from scratch.

In EVERY project I work on the first step is getting the system (or a large part of it) running locally. By seeing the code, running the code, debugging the code you can get a feel for how the system works.

The Codebase

In every system this is your source of truth, no matter what the docs say, what others tell you about how it SHOULD work this is how it DOES work.

This often challenging, it's like finding your way in a new city with no roadmap. Luckily in applications you have an entry point (a page loading / a front end API call etc.) pick a point and start there.

Use whatever you need to interact with it, whether it's PostMan, Rider's HttpClient or even a web page. Hook in your debugger, make the call and follow it through. Rinse and repeat for each part of the system.

Refactoring

Generally LEAVE THIS UNTIL YOU UNDERSTAND THE SYSTEM. It's ALWAYS tempting to 'throw it away and start again' however resist this temptation. Especially for a running system IT WORKS rebuilding / even refactoring a system is a HUGE risk. Yes it's FUN but for every line of code you change you risk introducing new snd exciting bugs.

Like everything else (especially when contracting) you need to justify the work you perform on a system. Either this justification needs to be focussed on one of the following:

  1. Performance - the system is slow and needs to be sped up (again be careful, what YOU find slow may be how the system is designed to work). Making one part smoking fast may introduce issues further down the system. Will you 'kill' the database server, if you make too many requests do you have a mechanism which won't cause cascading exceptions?
  2. Security - the system is insecure and needs to be made secure. This is a tricky one; especially in a legacy system. With bit-rot you may find that the system is using old versions of libraries which have known security issues. This is a good justification for work; however be aware that you may need to refactor a LOT of the system to get it up to date; again leading to the 'new bugs' issue.
  3. Maintainability - the system is hard to maintain and needs to be made easier to maintain. In general this becomes a lot of work; does the current lifetime of the codebase justify this? If you're spending more time making these changes to improve maintainability than you'd ever save for the customer then it's not worth it (and again, changed code == new bugs).
  4. User Experience - I generally prioritise these issues. For a user it doesn't matter if your code is 1% better; they are the ones who PAY for the work you do in the end. If you can make their experience better than it's generally worth it. However, be aware that this can be a 'slippery slope' of changes; you may find that you're changing a LOT of the system to make a small change for the user.

The Business Of Working On Legacy Systems

This is often overlooked; especially by developers. You need to be able to justify the work you're doing on a system. This is especially true in a contracting environment where you're paid by the hour. In the end, it's not YOUR code and not your money. The WHY you're making a change is often more important than the change itself.

I've worked as a contractor for over a decade now, it's not EASY; as a contractor each hour of your time is a 'cost' to the customer. You need to add move value to the system than you cost. If you're not then you'll quickly be looking for a new contract.

As developers, we tend to be crappy business people we're focused on 'perfection' at every turn. In reality, you don't need to make a system 'perfect' (I'd argue there's no such thing); you just need to deliver value to the customer.

On a longer term engagement this INCLUDES ensuring any new code is maintainable and cost-efficient to run. In legacy systems this is MUCH HARDER. You often have to wade through a swamp, being anxious that as you learn the system you CANNOT offer much value. I'm not making any changes you think I'm just learning the system. This is a fallacy; you're learning the system to make it better. You're learning the system to make it more efficient. You're learning the system to make it more maintainable. If a customer cannot accept this step; then you need to be very careful about how you communicate this (or look for a new contract).

The People

Again often overlooked, a lot of the time you're brought in as a contractor because some key person has left (don't get involved in the politics of this; it's not your concern). You need to be able to work with the people who are there to achieve the goals of the project. In a contract you'll generally have the terms of your engagement spelled out (in my current contract it's 'improve reliability and reduce running costs'); focus on this. If you're not sure what this means then ASK.

Keep in mind who your direct contact is; especially in the first couple of months. Keep them informed (you likely won't have new features to brag about as you get spun up on how the system works). I generally send a summary email every week / two weeks to my direct contact; this is a good way to keep them informed of what you're doing and what you're finding.

Remember, this is the person who will approve your invoice; there should be no surprises at invoice time. Once you start checking in code regularly this is less of an issue; you have a record of exactly what you did and the impact it had. Until then; you need to keep them informed. They need to know what you did and why they should pay you for it.

Reliability

Again, back to the legacy code; if you make a change in general you need to deploy it.Even the best of us will FUCK THIS UP from time to time, especially in legacy code systems there will be something you couldn't know when you deploy. This gets back to logging - if you have a staging server have it logging a LOT (but retained for a short period) then when you deploy to THIS you can gather more information about what failed.

No matter how good you are, how much local testing you've done we are all human. This is a kep part of the 'don't deploy on a Friday' rule; expect a deployment to cause a new issue on a system. Be prepared to work until it's resolved. If you don't KNOW why it failed, add more tests to reproduce the issue and more logging to ensure you catch similar issues in the future.

Especially for production systems your staging system may not be 1:1 (especially where load is concerned), tool like k6 can help you simulate load (even better locally where you can do proper profiling as mentioned previously).

Deployment

Again often overlooked in the fervour for CI/CD is the WHY of these. Simple; YOU WILL FUCK UP. Having a very quick and efficient way to deploy a system means that when you DO break it you can also fix it more quickly. If your CI code review system means it takes 2 days to get a PR merged then that's the quickest you can reasonably fi a system. If your CD system means that you take down the running system; get used to LONG nights.

An efficient mechanism to fix and deploy code is essential to an efficient development pipeline. If it takes longer to deploy a fix than it took to find and implement that fix in code then you're less likely to fix stuff.

Getting Paid

Now that we have all this out the way there comes the thorny problem of payment. I have a general rule:

Any payment issues I'm looking to move on

If they're late paying; depends on your contract but at LEAST within 30 days but generally closer to 7; this is a bad sign. If there's quibbles over your invoice (nickle-and-dimeing) think about whether the company is capable of paying for your services on an ongoing basis.

Don't mess around with this; if you've done your job you need to be paid in a timely fashion. It doesn't matter how much you enjoy it; if a bad customer CAN exploit you they will. It's NEVER worth it.

Be honest; only charge for time you actually worked; ensure it's clear what you did and why you did it.

I've led teams of developers and I've been a developer; I've been a contractor and I've been a customer. I've seen all sides of this and I can tell you; if you're not getting paid on time then you need to move on. On the flip-side if you have a contractor (or a FTE) who's not delivering then you need to address this quickly. Everyone has personal struggles but especially in a contract environment you need to be delivering value or not charging for time when you aren't.

As for rate; you will find your level; personally I charge more for projects where I have more responsibility (or which don't look like fun). I charge less for projects where I'm learning a new technology or for startups. I've also worked fewer days but kept my rate steady. But don't accept a low-ball rate; you're a professional and you should be paid as such.

In Conclusion

Well that's it. I'm off work today and tomorrow for my gran's funeral and frankly panicking a little that I have a HUGE legacy system to learn, so I thought I'd vomit out my thoughts on what working on these systems is like & the challenges.

logo

©2024 Scott Galloway