How many Book1.xls do you have?
Like almost every firm out there, you’ve probably got a whole bunch of these files on your server.
The IT department hate it. But to me, these files indicate something important.
These files are an indication that some staff are engaging in exploratory analysis. An analytical playground, if you like.
I see them as a sign that the information people are looking for is incomplete or not in the right format.
Now, while not every Book1 is a case of solid, successful data exploration, each one is a sign of early exploration: someone scratching the surface to get some traction with the data they have.
Once the exploration starts to take shape as something more valuable, the Book1 label is discarded. That’s a good sign people are taking ownership of their work.
So how many Book1.xls do you have?
The biggest issue with all these Book1s is that you have no idea what’s inside them.
You can’t classify the files, and you can’t analyse their statistics or tell how many times they’ve been used.
It’s unlikely you’ll understand what analysis the file was a precursor to. That’s probably the case even if you originally created it.
To solve this, you could get all technical and run some deep analytics on them using the likes of MapR, Talend, Jitterbit, etc.
This would throw up more information about the files:
- Data tables
- Sheet names
- Common column headings
But these are all out of context and highly variable.
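For what it’s worth, a first pass doesn’t need heavyweight tooling. Here is a minimal sketch, using only the Python standard library, that walks a folder and lists workbooks still carrying a default Book name, along with their size, last-modified time and (for .xlsx files only) their sheet names. The function names, the filename pattern and the choice of fields are all my own assumptions for illustration. It is not how MapR, Talend or Jitterbit would do it, and it deliberately skips the contents of legacy .xls files, which are a binary format that needs a third-party reader.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed pattern for default names: Book1.xls, Book2.xlsx, "Book1 (2).xls", ...
DEFAULT_NAME = re.compile(r"^book\d+.*\.xlsx?$", re.IGNORECASE)
# XML namespace used by xl/workbook.xml inside an .xlsx package
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def sheet_names(path):
    """Return the sheet names of an .xlsx file, or [] if they can't be read."""
    if path.suffix.lower() != ".xlsx":
        return []  # legacy .xls is binary; this sketch does not parse it
    try:
        with zipfile.ZipFile(path) as zf:
            root = ET.fromstring(zf.read("xl/workbook.xml"))
    except (zipfile.BadZipFile, KeyError, ET.ParseError):
        return []  # not a valid .xlsx package
    return [s.get("name") for s in root.findall("m:sheets/m:sheet", NS)]


def find_default_workbooks(root_dir):
    """Walk root_dir and describe every workbook still using a default name."""
    found = []
    for path in Path(root_dir).rglob("*"):
        if path.is_file() and DEFAULT_NAME.match(path.name):
            stat = path.stat()
            found.append({
                "path": str(path),
                "size_bytes": stat.st_size,
                "modified": stat.st_mtime,  # seconds since the epoch
                "sheets": sheet_names(path),
            })
    return found
```

Calling find_default_workbooks on a shared drive path would at least answer the headline question of how many there are, and which ones were touched recently.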
These Book1s are just a scratch pad for someone’s thoughts, and it’s likely there’s not a great deal of context in them anyway.
You’ll never know.
A different kind of playground
You could harness this data exploration by channeling it through a better medium.
What if you were able to collect all this analytical power for good? By capturing the individual’s thought process and the journey they went on to create their Book1, you could get a valuable insight into their focus.
But there’s more. What if you could offer peer review of their findings?
What if you could communicate with them on the topic and even offer some help?
Perhaps suggest some additional or complementary data sources that might strengthen their analysis.
Better still, let them explore and communicate with the internal DE community to see what other work has been done in this area.
It would be like an open source forum for data exploration within your own organisation.
Get outta here!
It would create and promote a data engineering culture where anyone in the organisation is free to explore and examine data. Plus, they’d have the help of Data Engineers and access to whatever data they needed (or were allowed to see).
It would be group think to the extreme!
You wouldn’t capture all the analysis going on but you’d get a lot of it.
You would also bring together the inquisitive minds in an unenforced environment.
And you’d save time; a lot of time.
Most importantly, you’d see fewer and fewer Book1.xls files.
Give your staff access to the right BI tools and things will rapidly change.