March 22, 2015

Data transformation in QlikView without the loading script

While I admire QlikView's genius associative in-memory engine I'm not a big fan of QlikView's loading script. If QlikView wasn't meant for departmental use, it would not have been a problem. But since QlikView is heavily oriented on business departments with limited IT supporting staff, the necessity to deal with transformation logic encoded in a proprietary script syntax is an obstacle because of several reasons:
  1. It makes it harder to promote QlikView in organizations, since readability of QlikView script is no better than of Visual Basic macros in Excel spreadsheets or SQL.
  2. Since business users usually can't write/read the scripts (they are not supposed to have programming skills), it inflates development and maintenance costs because even minor changes to the logic have to be made by developers and developers have to answer all questions about how this or that is calculated.
  3. Total lack of metadata lineage. To make it even worse, the dollar-sign expansions make it impossible to parse scripts and rebuild abstract syntax tree in external applications.
With QlikSense the situation is even worse, because QlikSense is targeted as a self-service data discovery platform, but the need in loading script makes it dead on arrival.

I believe that data transformation can be done without programming in the vast majority of cases. That's why I designed EasyMorph that can be integrated with both QlikView and QlikSense.

Today we've published an example of QlikView Integration. It shows how data transformation logic can be moved from a QlikView loading script into an EasyMorph project, where it can be visually designed, modified and viewed. While it doesn't replace the loading script completely (yet) it demonstrates how core transformation logic can be made accessible for non-technical users, who can now explore and modify the logic in a visual way, without dealing with the script syntax.


Here is its transformation logic. Input and output transformations are marked on the screenshot.


The logic loads a text file (one file per state), maps state names from another file, does a few calculations and aggregations, and exports the result into a QVD file. Here is the loading script with EasyMorph project integration (comments removed, download the example for the full version):

States:
LOAD * Inline [
State
California
New York
Texas
];

LET pathEasyMorph = GetRegistryString('HKEY_CURRENT_USER\Software\EasyMorph', 'ApplicationPath');

let vCount = NoOfRows('States');

FOR I = 0 to vCount - 1

    LET State = Peek('State', I, 'States');
   
    //create input file for the EasyMorph project
    EXECUTE cmd /C copy /Y "Inc5000 $(State).csv" input.csv;
   
    EXECUTE $(pathEasyMorph) /c /run "Inc5000 for QlikView.morph";   

    //read the QVD generated by the EasyMorph project
    LOAD
        *,
        '$(State)' as State
    FROM output.qvd (qvd);

NEXT I


Since EasyMorph can't process a list of files yet (this feature will be added in future releases) a simple workaround is used -- a cycle, that iterates through a list of file names, and renaming.

For convenience the EasyMorph project can be opened right from desktop QlikView (in example there is a button for this).

For server execution, you might need to specify the path to EasyMorph executable explicitly, since EasyMorph is installed under a user account and may not be available in the registry under QlikView Server's account.

While it's not shown in the example, you can also add a link to the HTML documentation automatically generated by EasyMorph when it runs a project with /doc parameter from command line (see help on this). In this case users will be able to see the documentation right in the web-browser, next to the application. Or, right in the application, using an extension for web page embedding.

Links:
EasyMorph website
QlikView Integration example

PS. Technical details are copied from EasyMorph Blog.

March 5, 2015

EasyMorph as an ETL for Tableau

Tableau is a great tool. I've been watching it for a few years and I like the way it's developing -- more sophisticated computations, better performance, faster visual engine. Tableau aims at analysts without technical background -- an audience which probably is most under-served in terms of tooling than any other category of office workers. All what they usually have for their data analysis needs is Excel, love it or hate it.

At the same time I struggled to understand the conceptual breakthrough behind Tableau. While it's often considered a prominent Data Discovery platform, when it comes to data transformation it's not significantly less dependent on IT staff than traditional Business Intelligence platforms. Just as traditional BI, Tableau needs well prepared data, preferably residing in a database. And when it comes to data manipulation in databases all user-friendliness ends -- last time when SQL was user-friendly was in 1980s.

In my understanding the reason why Data Discovery started outpacing traditional BI is that it can liberate business teams from dependency on IT folks. Data Discovery tools must be self-sufficient by definition. It means that users should be able to collect, clean, transform and prepare data for analysis themselves. And this is why they still love Excel -- because it makes it possible, because it lets them control. And that's why QlikView is more self-sufficient -- because it has its own ETL. While QlikView has its own pros and cons (personally I'm not a big fan of its loading script), but in vast majority of cases it has all you need to prepare data for analysis and visualization. But what about Tableau?

I've designed EasyMorph, an easy-to-use ETL tool intended for exactly the same target audience as Tableau -- knowledge workers without a technical background. Like Tableau, EasyMorph is highly visual and it doesn't require SQL skills or programming. Like Tableau, it's fast and lightweight, and it doesn't demand weeks of training. My goal is to make EasyMorph a perfect companion for Tableau and allow Tableau to become a truly Data Discovery platform, owned by business users. Today EasyMorph made one more step towards that goal -- we've added support for Tableau Data Extracts.

If you're a Tableau user, give EasyMorph a try -- maybe this is something that you were missing in your Tableau experience. Feel free to drop me a few lines with your thoughts on it. You can find my email in the right sidebar of this blog. Thank you!

UPDATE
In 2015 EasyMorph has become a Tableau Technology partner. It has got a server edition and deeper integration with Tableau Server.

UPDATE #2
Apparently Tableau understands the lack of built-in data preparations capabilities -- in October 2017 they've unveiled Tableau Maestro, a light-weight data preparation tool. You can think of EasyMorph as "Maestro on steroids".