Musings on Conway's Law
"Organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations."
— M. Conway
For some reason I've always loved Conway's law ... and in this article I'm going to dig deeper into it and see what's behind it. I think I might find some hidden gems.
So is it true?
Based on my own experience I absolutely, completely think it is true.
This is based on intuition more than anything, though. The question is: can I find any hard evidence to back it up? In preparation for this article I did some more research to see if anyone more reputable than me (which wouldn't be hard) had tried to assess how true Conway's claim might be. And I found some really interesting stuff. Who knew!
Top of the list of reference papers in Wikipedia is a Harvard Business School paper by Alan MacCormack, John Rusnak and Carliss Baldwin. It is an excellent read which purports to show, for one fairly limited set of product types, that software systems designed by more tightly coupled organisations (in this case commercial, private sector software companies) are themselves more tightly coupled than software systems developed by the much more loosely coupled "organisations" of the open source sector.
To test this they used a specialised code analysis tool to measure the level of inter-dependency, both direct and indirect, in the code of the systems listed below.
This tool measured the impact of a change on the code: how many parts of the code were affected by the change? They postulated (without much detail that I could see) that the open source systems were developed by more loosely coupled organisations, which is a fair enough assumption. What they were not able to do was provide any evidence that the closed source systems were developed by organisations with tightly coupled org structures. Again, it's a fair assumption, but there is little evidence presented in the paper, that I could see, to back it up.
As detailed below, in the five system types they compared, changes to the loosely coupled, open source systems had a lower overhead than equivalent changes in the private, commercial off-the-shelf systems.
What the results show is that the "propagation cost" measures are much higher for systems that are the product of tightly coupled organisations. This is certainly an interesting finding, but it's a long way from a slam dunk proof. For one thing, it's a pretty limited sample group and, more critically, there's not much real detail provided on the actual org structures at play in each of the organisations selected.
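To make the "propagation cost" idea a bit more concrete, here is a minimal Python sketch. It rests on an assumption of mine: that the measure can be read as the share of all file pairs that are connected by a dependency, whether directly or through a chain of other files. The function name, the file names and the two toy systems are all invented for illustration. This is not the researchers' tool, just a rough picture of the kind of calculation involved.

```python
def propagation_cost(deps):
    """Fraction of (file, file) pairs linked by a direct or indirect dependency.

    `deps` maps each file to the set of files it depends on directly.
    Every file is counted as reaching itself (an assumption of this sketch).
    """
    files = sorted(set(deps) | {t for targets in deps.values() for t in targets})
    # Start with the direct dependencies plus the file itself.
    reach = {f: {f} | set(deps.get(f, ())) for f in files}
    # Warshall-style transitive closure: if i reaches k, i reaches all k reaches.
    for k in files:
        for i in files:
            if k in reach[i]:
                reach[i] |= reach[k]
    connected_pairs = sum(len(r) for r in reach.values())
    return connected_pairs / (len(files) ** 2)


# Toy example 1: four files chained in a cycle, so a change anywhere can ripple everywhere.
tight = {"a": {"b"}, "b": {"c"}, "c": {"d"}, "d": {"a"}}
# Toy example 2: two independent pairs of files.
loose = {"a": {"b"}, "c": {"d"}}

print(propagation_cost(tight))  # 1.0
print(propagation_cost(loose))  # 0.375
```

In the tightly coupled toy system every file can reach every other file, so the score is 1.0. In the loosely coupled one only 6 of the 16 possible pairs are connected. The direction of the arrows does not change the overall score, since it only counts connected pairs.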
So I decided to keep on looking. Next!
This is a Microsoft research paper, and as I dived into it I realised that this time the researchers had gone much deeper into understanding the org structure itself. And since the source was the Microsoft team that created Windows Vista (don't hold that against them), they were going to have a lot of really good data to work with.
We are talking about a team of several thousand engineers and 3,404 separate binary files containing over 50 million lines of code. With access to Microsoft's own version control system, they were able to gather really detailed data on factors such as:
the number of engineers who changed the code in a particular file
who those engineers were
where they fitted in the org structure and
the relative quality of the code, as measured by the number of post-release fixes that had to be applied to the code file in question.
To kick things off here's how they summarised the key questions they hoped to answer in their paper:
How does organizational complexity influence quality?
Can we identify measures of the organizational structure?
How well do they do at predicting quality, e.g., do they do a better job of identifying problem components than earlier used metrics?
What I found particularly interesting was how they came up with measurable characteristics from the org structure. For each binary file they assessed the following attributes:
Number of Engineers (NOE)
This one should make you think of the Mythical Man Month! Do more engineers mean better code?
Number of Ex-Engineers (NOEE)
Does staff turnover impact code quality?
Edit Frequency (EF)
Is this a factor? Do more edits indicate something material?
Depth of Master Ownership (DMO)
This measure allowed them to trace each engineer who changed the code back to their line manager. The more of those engineers who reported to the same manager, the deeper the ownership (in their system). Conversely, when the engineers making changes were spread across several managers, ownership of the code had to land farther up the hierarchy, in one of the manager-of-managers layers.
Percentage of Org contributing to development (PO)
This is defined as "the ratio of the number of people reporting at the DMO level owner relative to the Master owner org size". It is closely linked to the previous measure, DMO. Typically, the lower the percentage, the bigger the total team size, when teams are measured based on who "owns" the code in the hierarchy.
Level of Organisational Code Ownership (OCO)
Defined as "the percent of edits from the organization that contains the binary owner or if there is no owner then the organization that made the majority of the edits to that binary", this measures the degree to which engineers who change the code work for the same "organisation" (they don't really define organisation here though - I think they mean all the people under a single manager in the hierarchy).
Overall Organisation Ownership (OOW)
"This is the ratio of the percentage of people at the DMO level making edits to a binary relative to total engineers editing the binary. A high value is good." Or in other words, the more tightly concentrated the edits are, e.g. within a single team, the better.
Organisation Intersection Factor (OIF)
"A measure of the number of different organizations that contribute greater than 10% of edits, as measured at the level of the overall org owners."
These are all interesting measures (though they do seem to want to make them sound more complex than they need to be ... researchers eh!). Thankfully, they also included the following table to help us understand what they meant in simple English:
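To get a feel for how measures like these could fall out of version control data, here is a small Python sketch that computes four of the simpler ones (NOE, NOEE, EF and OIF) for a single binary. Everything in it is an assumption on my part: the `Edit` record, its field names, the `org_metrics` function and the sample data are invented, and the definitions are my simplified readings of the descriptions above rather than the researchers' exact method. The hierarchy-based measures (DMO, PO, OCO, OOW) need the reporting structure as well, so they are left out.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """One hypothetical check-in record pulled from version control."""
    binary: str
    engineer: str
    org: str               # top-level organisation the engineer belongs to
    still_employed: bool   # False if the engineer has since left


def org_metrics(edits, binary):
    """Simplified NOE, NOEE, EF and OIF for one binary."""
    rows = [e for e in edits if e.binary == binary]
    engineers = {e.engineer for e in rows}
    ex_engineers = {e.engineer for e in rows if not e.still_employed}
    edits_per_org = Counter(e.org for e in rows)
    total = len(rows)
    # OIF: organisations contributing strictly more than 10% of the edits.
    oif = sum(1 for n in edits_per_org.values() if n * 10 > total)
    return {
        "NOE": len(engineers),      # distinct engineers who edited the binary
        "NOEE": len(ex_engineers),  # of those, how many have left
        "EF": total,                # total number of edits
        "OIF": oif,
    }


# Invented sample data: ten edits to one binary from three organisations.
edits = (
    [Edit("kernel.dll", "ann", "core", True)] * 4
    + [Edit("kernel.dll", "bob", "core", False)] * 3
    + [Edit("kernel.dll", "cat", "shell", True)] * 2
    + [Edit("kernel.dll", "dan", "net", True)] * 1
)

print(org_metrics(edits, "kernel.dll"))
# {'NOE': 4, 'NOEE': 1, 'EF': 10, 'OIF': 2}
```

In this made-up example the "net" organisation contributes exactly 10% of the edits, so it does not count towards OIF, which asks for more than 10%.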
So what did they find? Well, I am going to save you the trouble of reading through all of their detailed workings and let you know that they found the organisational measures above to be better predictors of code quality than any of the other measurable factors they compared them against: Code Churn, Code Complexity, Dependencies, Code Coverage and Pre-Release Bugs.
Or as they say themselves:
Our organizational measures predict failure-proneness in Windows Vista with significant precision, recall and sensitivity. Our study also compares the prediction models built using organizational metrics against traditional code churn, code complexity, code coverage, code dependencies and pre-release defect measures to show that organizational metrics are better predictors of failure-proneness than the traditional metrics used so far.
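In case "precision" and "recall" are unfamiliar, here is a minimal Python sketch of what those two terms mean for a failure-proneness prediction. The function name, the file names and the numbers are all invented for illustration. They are not results from the paper.

```python
def precision_recall(predicted, actual):
    """Precision and recall for a set of binaries flagged as failure-prone.

    `predicted` is the set of binaries a model flagged.
    `actual` is the set that really did need post-release fixes.
    """
    true_positives = len(predicted & actual)
    # Precision: of the binaries we flagged, how many really were a problem?
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: of the binaries that were a problem, how many did we flag?
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Invented example: the model flags four binaries, five actually failed.
predicted = {"a.dll", "b.dll", "c.dll", "d.dll"}
actual = {"b.dll", "c.dll", "d.dll", "e.dll", "f.dll"}

print(precision_recall(predicted, actual))  # (0.75, 0.6)
```

A model with high precision rarely raises false alarms, and one with high recall rarely misses a real problem. The claim in the quote is that the organisational measures did well on both counts.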
So there you go. Interesting. So what does this all mean? Well, if you want to be a real IT Architect, it is something I think you really need to think long and hard about. If it's really true (and the data points that way from what I can see), then you'd better not be trying to design systems that conflict with the communication structure of the organisation in which they need to work: for example, dynamic, deeply software-integrated systems in a waterfall-structured org ...
Unfortunately, for many of us today this is exactly the scenario we are facing and, even worse, most of us have very little hope of fixing it in the short term, if ever.
Unless .... how about we try and answer this question ...
If Conway's Law is really true, what kind of org structure would facilitate high quality digital systems?
In my next post on this topic, for the fun of it, I am going to take the factors below and see what an optimally designed org structure might look like.
Number of Engineers (NOE)?
Number of Ex-Engineers (NOEE)?
Edit Frequency (EF)?
Depth of Master Ownership (DMO)?
Percentage of Org contributing to development (PO)?
Level of Organisational Code Ownership (OCO)?
Overall Organisation Ownership (OOW)?
Organisation Intersection Factor (OIF)?
Till next time.