November 4, 2015

800 transformations

Today one our customer showed me an EasyMorph project that they run every day. It has 800 transformations and performs consolidation of financial data received from multiple counterparties. These counterparties send data in non-uniform formats hence complex consolidation rules. We use some large projects internally at EasyMorph (these are test cases for functions and transformations), but the largest one has around 400-500 transformations (and I thought that was big). Here we have 800 and it didn't strike the customer as something unusual. Just a daily routine. Which led me to a few thoughts:

Apparently, the concept of modelling data transformations, that we use in EasyMorph, works rather well. Just to remind you, instead of connecting individual transformations with arrows (which is typical for the vast majority of ETLs), transformations in EasyMorph are organized into something like function pipelines, where output of a transformation implicitly becomes an input of a following transformation (see how it works on http://easymorph.com). This eliminates the need to explicitly define data flow between transformations with arrows. Imagine what a mess such project would have been if it had 800 arrows, connecting 800 blocks!


This approach allows EasyMorph to pack transformations much more densely than traditional tools, and operate with logical blocks of transformations, rather than individual transformations, which provides a higher level view and abstraction. This makes EasyMorph conceptually closer to [functional] scripting languages than to ETL, but since it's highly visual working with EasyMorph feels more like working with a spreadsheet, rather than programming. It's worth noting that it took the customer only a few days to create the project, not weeks or months -- yet another confirmation that visual designing works better than scripting.

The more examples of use I see, the better I understand (and I maybe only a few steps ahead of our users in this regard) what's the best positioning of EasyMorph. It now is becoming clear that it's rather well suited for modelling complex and very complex business logic on moderate amounts of data (much more complex then I initially supposed). The picture below illustrates this:

Here we have two axis -- amount of data (vertical) and complexity of business logic (horizontal). Nowadays, everybody is talking about Big Data and looking for ways to handle large data volumes, since it's increasing exponentially (at least that's what we're told). EasyMorph targets a problem orthogonal to Big Data -- it addresses complexity of business rules, offering ways to easily design and maintain rather complex logic, using approaches and transformations that sometimes simply not possible for SQL-based tools.

And it looks like it works. 800 transformations.

PS. I usually have difficulties explaining people what EasyMorph is because there is no commonly accepted terminology that can be used for it. Is it ETL? Kinda yes, but, emmm.... not really. Actually, not at all. It's a bit like programming, but without programming... Like Excel but for tabular data...Well, it would be easier to just show it to you.