System Monitoring – Part 1

This is the first in a series of articles about systems monitoring for software developers. I am writing about it because I have been working on it a fair amount over the last six months in my current role here in the UK.

The company I work for is pretty typical for a large firm. We have a mix of legacy systems that we have to keep running while we update them and integrate new systems. These systems vary in quality, and the majority of the original developers no longer work for the company. When any of these systems goes wrong, it can be quite difficult to diagnose the problem.

Image: System Monitoring - Programmer’s Desk (taken by Duncan Verrall)

As an example, we have one particular system that runs overnight and handles file synchronisation between multiple sites. This system does write out log files, but they are the most unfriendly log files I have ever seen. For a start, they are so large that people do not even bother to look at them anymore. Even if you do open them, the formatting is so strange and complicated that you really struggle to see what is going on. This particular system is one of the more extreme cases. Other systems write data to log files, add huge amounts of data to a logging database, or do a mix of both. In some cases the amount of data logged is so large that searching for meaningful information after a system failure is extremely difficult. It is even harder when you are under pressure during an outage that is having a direct revenue impact on the business.

In the rest of this series, I will talk about the monitoring system I designed and developed, covering both its design and its implementation. I will also talk about some of the problems I faced, and how this system has helped avert a few major live incidents. For obvious reasons I can’t discuss actual systems and internal processes at my company, as I don’t think they would like that too much, so some details have been changed or omitted. That really doesn’t matter for the purpose of this article, as the intention is to discuss the architecture of the system. I hope this series will help you think about the monitoring needs of your own organisation.

The Obligatory First Post

Wow, this is the first post!! It feels a little empty around here at the moment. My name is Stephen Haunts and I am a software developer currently working in the financial services industry in the United Kingdom. I have decided to start this site as a way of capturing some of my ideas and sharing them with the world.

I mainly work in C# and .NET. This is because that is what my job requires, so I will tend to bias towards the Microsoft Development Stack when talking about actual software development. I also enjoy working with .NET; it is a fun system to write software with.

I don’t intend to just talk about code. This site covers more of the architectural aspects of enterprise software development. Most of these ideas can sit above the actual language and implementation details.

I have ideas for several series of articles based on topics I have been thinking a lot about recently. As a little hint of what is to follow, some of these themes are application system monitoring, PCI DSS, zero-downtime deployment models, Agile/Lean development and software development leadership.

I hope you find the content enjoyable, interesting and thought-provoking. Please subscribe to the blog using the subscription widget to the right of the screen. You can also use the site’s RSS feed in your favourite news reader. Please comment on and contribute to the articles if you feel you have something to say. I don’t expect you to agree with everything I say; the initial articles serve as a starting point for each subject, and your contribution to the conversation can extend them much further.

Well, that’s the obligatory first post out of the way. That wasn’t as scary as I thought it would be.
