Imagine this situation: You’re working on some code and you get an exception when you run the unit tests. Next to the output is a link with the text: “User Joe had the same exception two months ago and fixed it with the commit b8cfda02.”
How would that work? We’re using big data for all kinds of things, tracking customer happiness, searching the Internet and discovering terrorist threats (or not).
Standard development teams have about 10 people. That means you have a super computer with 40-80 cores, 160 GB of RAM and 20 TB of disk space connected with a fast LAN in your office already. That beast is usually idling while it waits for the developers to press keys. It would be pretty simple to install a clustered log analyzer on this hardware which simply reads all the log files and reports which Maven and running JUnit test creates. It would be as simple to connect the same database to your version control. That means this system could track all the errors and exceptions that you get when you run unit tests or the whole application.
This information could then be used to detect when someone in the team gets a new exception plus the change sets which fixes them. If the system detects an exception which it has seen before, it can tell you which developer has fixed it or who is currently working on it – instead of wasting your time, you could see the code which contains the solution or ask someone who has already solved the problem.
With proper filtering, the data could be split into internal and framework code. That way, the system could report to library projects where consumers struggle most.
On the large scale of things, this system can tell you which parts of the system are most brittle.
As usual with big data, there are some downsides. The same system would tell you which developer breaks the code most often. Who writes the worst code. If your manager isn’t able to see the human value in his charges, this might not be your best bet.
Related Articles:
- The Next Best Thing – Series in my blog where I dream about the future of software development
Tagged: Big Data, Software Development, TNBT
