Do you do that thing at the end of each month where you produce your reports, save all your files, close them down and then take a big sigh of relief?
Only to find that next month, you have to refresh your memory about what you did:
- How did you get the data?
- Why are your links broken?
- Why doesn’t your spreadsheet work?
- How did you copy over those values?
- Why are these cells raw values and not formulas?
It just seems like this is an uphill battle. Surely there’s a better way, right?
EXCEL IS NOT YOUR FRIEND
The problem is caused by the tools you have available to perform the job you need to do. You’re not alone.
It’s likely most of this work is carried out in Excel. The majority of all report generation involves Excel at some point along the way.
But that’s the issue!
IS THERE A BETTER WAY?
Fortunately, there are a number of really good data prep tools out there that will allow you to not only perform the data transformation you need to do but record the processes so that you can reuse that process in the future.
It’s like recording a macro in your old friend Excel!
Each month is then only a matter of pressing the refresh button.
These tools automate most of the processes you perform manually to extract data out of various systems, transform it into the right shape and then load it into whichever database, excel spreadsheet or reporting format you need at the end.
These are called ETL Tools.
Extract, Transform & Load
Retrieve data from a database, flat file or other data source
Modify or convert the extracted data into the required format. (Nothing happens to the source data)
Load the transformed data into the target database or repository
WHAT TO LOOK FOR IN AN ETL
If you’re evaluating ETL solution, it’s important to prioritise these core functions and inclusions:
- The ability to connect to, and extract data from, your data sources
- User friendly environment that supports drag and drop
- Collaborative development capabilities for integration
- Basic data transformation functions, such as data type conversion, date reformatting and string handling
- Data profiling software that can analyse source data for consistency, dependencies, etc.
- Data quality and cleansing functionality for identifying and fixing errors in data sets, plus data synchronisation for keeping data consistent in source and target systems.
- Metadata management support for synchronising integration processes and documenting data transformation and business rules.
- Job scheduling and process management controls, including monitoring and error handling and logging.
Sounds a bit too hard?
ETL tools have been around for ages but most of these required some level of technical skill in programming, SQL script, or other language. And they’d be quite restricted in what data sources you could interrogate.
Nowadays we have ETL tools that are drag and drop, point and click and you can save them as files to share with co-workers. If you can use Excel, you can use a modern ETL.
These tools also connect to virtually any data source you’d come up against, inducing Excel, databases and cloud data source …. even non-structured data.
The beauty about them being visual is that anyone can understand the processes that is being performed.
Probably the best thing of all is that when you open up next months’ end of month reports everything is ready to go and the update process is nice and simple.
7 of the best ETL tools
Here are 7 options which are pretty easy to use:
- MS Power BI – https://powerbi.microsoft.com/
- Alteryx – https://www.alteryx.com/
- Pentaho – https://www.hitachivantara.com/go/pentaho.html
- Groovy ETL – https://github.com/ascrus/getl
- KNIME Analytics Platform – https://www.knime.com/knime-software/knime-analytics-platform
- Talend – https://www.talend.com
- Rapid Miner – https://rapidminer.com/
Each of these data prep tools options have pros and cons so it pays to try them out and see if they work within your environment.
Remember, your data is the most important component here so make sure the tool works with it, rather than trying to make your data work with the tool.