On Less is More
Thelonious Monk once said “Wrong is Right”. I say, “Less is More”. All too often we as software developers do data collection of one sort or another, often storing results in a database table or tables, and we suffer from self-induced overkill.
We collect too much data, data that we probably will not need. Or, instead of storing one row per unique column value and simply updating its count via an Insert or Update SQL statement, we end up storing hundreds of separate rows that, because of the data collection overkill we’ve engineered, take up lots of space but don’t really contribute to the “cause”.
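To make the "Insert or Update" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name event_counts, its columns, and the record_event helper are all hypothetical, made up for illustration; the point is only that repeated values bump a counter instead of piling up as new rows.

```python
import sqlite3

def record_event(conn, name):
    """Bump the counter for `name`, creating the row the first time it is seen."""
    # Try the Update first; rowcount tells us whether a row already existed.
    cur = conn.execute(
        "UPDATE event_counts SET hits = hits + 1 WHERE name = ?", (name,)
    )
    if cur.rowcount == 0:
        # No existing row for this value, so Insert it with a count of 1.
        conn.execute(
            "INSERT INTO event_counts (name, hits) VALUES (?, 1)", (name,)
        )
    conn.commit()

# Hypothetical schema: one row per unique value, plus a counter.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_counts (name TEXT PRIMARY KEY, hits INTEGER NOT NULL)"
)

for name in ["login", "search", "login", "login", "search"]:
    record_event(conn, name)

rows = conn.execute(
    "SELECT name, hits FROM event_counts ORDER BY name"
).fetchall()
print(rows)  # [('login', 3), ('search', 2)]
```

Five events end up as two rows. Many databases also offer a single-statement upsert, but the Update-then-Insert form above is the most portable way to show the idea.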
In addition (and I have certainly been guilty of this) we store our data in database tables that are not normalized, thereby exacerbating the situation. We end up with wide tables with a lot of columns that are inefficient.
It is often much easier (and simpler) to start out with a minimalist approach. Less is More. If we determine at a later point that we actually do need “More”, we can always add that later. I believe it is easier to add needed features to a well-thought-out basic design than to remove stuff later. It’s human nature.
You are quite right - we become so worried that our applications won't do everything that's required of them that we load them full of speculative functionality. It's taken a few years but I'm getting the hang of YAGNI.
Even with my BI hat on, what you're saying about well structured (normalised) databases holds true. Every database should exist in 3NF somewhere - even if your application uses strange denormalised structures to improve performance, these should be nothing more than derived staging tables.
A good schema should be expandable while remaining backward compatible. Similar metrics to the interface members rule apply here: tables should have between 5 and 12 fields, but definitely no more than 20.
I've applied this principle to a dataset with 1098 fields and 12m rows of data annually, and it's made analysis and integration a trivial exercise. The more I work with data, the more I realise that Dr Codd really knew what he was on about.
@Ben,
Thanks for your expert comment. I work with OPC (Other People's Code) all the time and I never fail to be amazed and disappointed at the amount of unnecessary complexity that developers weave into their creations.