May 1, 2016

Why I prototype Qlik apps in EasyMorph before creating them

If you want to create a Qlik app just create it, why would anyone build a prototype in another tool first? Isn't it just a waste of time? For simple cases -- probably yes, but for complex apps prototyping them first allows designing apps faster and more reliably. Here is why:

When developing Qlik apps with a complex transformation logic one of the main challenges is to deal with data quality and data structure of poorly documented source systems. Therefore the most time-consuming phase is figuring out how to process data correctly and what can potentially go wrong. There are many questions to answer during this phase, for instance:
  • How do we merge data -- what are the link fields, what fields are appended? 
  • Does any combination of the link fields have duplicates in one or the other table? 
  • Do the link fields have nulls?
  • Are nulls actually nulls or empty text strings?
  • Are numbers actually numbers, not text?
  • Do text values have trailing spaces?
  • After we join tables, does the result pass a sanity check?
  • How can we detect it if the join goes wrong on another set of data (e.g. for another time period)?
  • Are dates and amounts within expected ranges?
  • Do dimensions have complete set of values, is anything missing?
  • When dealing with data in spreadsheets
    • Are text and numbers mixed in the same column? If yes, what is the rule to clean things up?
    • Are column names and their positions consistent across spreadsheets? If not, how do we handle the inconsistency?
    • Are sheet names consistent across spreadsheets?
These are kinds of questions you would want to have answered before believing that a transformation works correctly and reliably. Unfortunately, scripting in Qlik, like any other scripting, is poorly suitable for interactive data profiling. And that's where you start wasting your time. To answer questions like those above in Qlik you would need to modify and re-run script many times and every time Qlik will reload same data again and again. Even if it's a small test dataset in a QVD, it still takes time, or you risk profiling too little data. Oh, and if you mistakenly omitted a comma your app is dropped and you need to re-open it and re-run again. Then after each reload you would have to create some one-off layout objects and write some one-off expressions, just to get one question answered. Tricks like inspecting resident tables make life a bit easier, but just a bit. Overall, it's an inconvenient and cumbersome process. It effectively discourages data exploration, which sooner or later leads to errors that are discovered on late stages, where the cost of redesigning and fixing errors is much higher than on earlier stages.

This is where EasyMorph comes in handy. First, it loads data once, then keeps it in memory, therefore, it doesn't have to be reloaded every time. And if you load sample data only one time, why not use a bigger data set which is usually better for data profiling? Not only does EasyMorph load data only once, it also keeps in memory results of all successful transformations. So if an error occurs, you continue from where it stopped, not from the beginning -- another time-saving feature.

Second, EasyMorph runs transformations automatically in the background after any change. It's like if you are writing a Qlik script, and while you're writing it after any change or new statement Qlik runs the script proactively, without requiring you to press Reload. Except Qlik doesn't do it. Basically, transformations to EasyMorph is what formulas are to Excel -- you change one and immediately see a new result of calculations, regardless of how many formulas/transformations it took.

Third, designing a transformation process visually is much faster than scripting. Some Qlik developers are exceptionally good at writing scripts, but even they can't beat it when a whole transformation like aggregation is created literally in two clicks. If one knew exactly from the beginning what a script should do then writing it quickly would not be a problem. It's the numerous iterative edits, corrections and reloads that make writing Qlik scripts long. Once I have designed and debugged a transformation logic in EasyMorph, scripting it in Qlik is usually a matter of couple hours, and it typically works reliably and as expected from the 1st run.

Another important advantage of prototyping Qlik apps in EasyMorph is that it allows creating a reference result. When you design a Qlik application off an existing Excel or BI report it usually makes the task easier because numbers in the old report serve as a reference you can compare against. However, if you design a brand new report there might be no reference at all. How can you be sure that your Qlik script, expressions and sets work correctly? There is a whole lot of things that can go wrong. Building a prototype in EasyMorph gives you that reference point and not just for the script, but also for expressions, including set analysis. In airplanes, measuring crucial indicators like altitude and velocity must be done using at least two probes (for each metric) that utilize different principles of physics so that pilots can be sure it's measured correctly. The same principle here -- "get another reference point".

I also found that designing apps in close cooperation with business users is more productive when the users have good understanding of how a transformation logic works. It's better explained by letting them explore a visual process in EasyMorph rather than showing totally cryptic (for them) Qlik scripts.

Resume: EasyMorph is a professional tool which can be used by QlikView / Qlik Sense developers to create robust and reliable applications faster by prototyping them first. I do it myself, and so far it works pretty well.