Azure Data Factory Wish List

Posted in :

Ed

Azure Data Factory is by far my favorite of all the ETL tools I have used, and I love that, since it is in the cloud, the team is constantly iterating. However, when you live in any app all the time, there are things you really wish for. Here are some of mine. (This may become an ongoing series.)

Many of these are active requests from the community that you can see on the Feedback site.

Dark Mode

There are two kinds of developers in this world: those who want dark mode and those who like Blinding Lights.
Everywhere I can, I turn my apps to dark mode, including many Azure tools (Azure portal, Data Studio, VS Code, Visual Studio). However, ADF doesn’t natively have a dark mode setting, and it is this super white canvas. There is an open request for it, but so far there is no word on whether it is being worked on.

Till then, I have found the Dark Reader browser extension is a workable solution, albeit with a tiny JS error.

Execute SQL Activity

Many of us developing in ADF come from an SSIS background. We often used the Execute SQL task to run simple commands as part of our packages. Unfortunately, so far ADF has not seen fit to add this to the complement of activities.

This leaves you with several choices:

  • Stored Proc – Put your SQL command in a stored procedure (which, if it is not dynamic, means you may end up with a lot of single-use procs). This is the one I use the most.
  • Pre-copy script – If the command is part of a copy activity, you can execute it on your sink via a pre-copy script. Mind you, no post-copy script exists, so if you want to create a temp table, great, but you can’t drop it afterwards unless you use the aforementioned dynamic proc.
  • Copy Source – If the command is for the source of your copy activity, you can just add it to your copy query.
  • Lookup – If you are doing it as part of a lookup, you can add the query there. A lookup activity can also return a result, which a stored proc activity cannot.
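
To make the stored proc option concrete, here is a minimal sketch in Python of the activity definition a pipeline would carry. It assumes the system procedure sp_executesql plays the role of the dynamic proc and receives the command in its stmt parameter. The helper name, activity name, linked service name, and table name are made up for illustration, and the dict only mirrors the shape of the pipeline JSON; it does not call ADF.

```python
import json


def build_execute_sql_activity(name, linked_service, sql):
    """Return a Stored Procedure activity that runs `sql` via sp_executesql.

    Hypothetical helper: it only builds the JSON shape, it does not talk to ADF.
    """
    return {
        "name": name,
        "type": "SqlServerStoredProcedure",
        "linkedServiceName": {
            "referenceName": linked_service,  # assumed linked service name
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "storedProcedureName": "sp_executesql",
            "storedProcedureParameters": {
                # sp_executesql takes the statement in its @stmt parameter
                "stmt": {"value": sql, "type": "String"},
            },
        },
    }


# Example: the "post-copy" cleanup that a pre-copy script cannot do.
activity = build_execute_sql_activity(
    name="DropStagingTable",
    linked_service="AzureSqlLinkedService",
    sql="DROP TABLE IF EXISTS dbo.StagingOrders;",
)
print(json.dumps(activity, indent=2))
```

Chained after a copy activity, something like this would stand in for the missing post-copy script, though it is still a workaround rather than a real Execute SQL activity.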

All that to say, there are no good alternatives, so here’s to hoping for an Execute SQL activity. Go vote it up!

Variable Scoping

Maybe this is another one of those “this thing worked in SSIS, can I have it in ADF” requests. In SSIS (or .NET more broadly) you could specify how your variables were scoped. Many times this drove me nuts because I scoped something wrong, but at least it was explicit. In ADF everything is scoped to the pipeline. It appears that inside a ForEach a variable is scoped to the iteration, BUT that isn’t explicit and I don’t have any say over it, so I only THINK it works. As someone who has stupidly run different flows expecting my variable to change, this bothers me.

Maybe this goes along with there being no real containers within pipelines, like there were in packages (I know, it is not fair to compare ADF to SSIS, but that is where many of us come from).

Now, Azure Data Factory’s recommendation around this would be: your pipeline should ONLY do one thing. If you want something within another scope, create a separate pipeline and run it there. Then, guess what, your variable is scoped narrowly to that child pipeline. I get why they do it. They are helping enforce what they think is a good design pattern, and most days I am fine with it. Sometimes, though, it would be nice if the scoping were more obvious.
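
The child pipeline pattern can be sketched as below, again as a Python dict that mirrors the pipeline JSON. The helper, pipeline, parameter, and activity names are all hypothetical. The point is only that each iteration hands its item to a child pipeline, so any variable the child sets belongs to that child run.

```python
import json


def build_foreach_with_child(items_expression, child_pipeline, item_param):
    """Return a ForEach activity that runs a child pipeline once per item.

    Hypothetical helper: it only builds the JSON shape, it does not call ADF.
    """
    execute_child = {
        "name": "RunChild",
        "type": "ExecutePipeline",
        "typeProperties": {
            "pipeline": {
                "referenceName": child_pipeline,  # assumed child pipeline name
                "type": "PipelineReference",
            },
            # Pass the current item in; the child sets its own variables from it.
            "parameters": {
                item_param: {"value": "@item()", "type": "Expression"},
            },
            "waitOnCompletion": True,
        },
    }
    return {
        "name": "ForEachItem",
        "type": "ForEach",
        "typeProperties": {
            "items": {"value": items_expression, "type": "Expression"},
            "isSequential": False,
            "activities": [execute_child],
        },
    }


foreach = build_foreach_with_child(
    items_expression="@pipeline().parameters.tableNames",
    child_pipeline="LoadSingleTable",
    item_param="tableName",
)
print(json.dumps(foreach, indent=2))
```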

None of these are deal breakers. I love ADF’s ability to parallelize, and the team is constantly improving it, but these are minor things I’d love to see worked on.

2 thoughts on “Azure Data Factory Wish List”

  1. A way around for using Execute SQL in ADF:

    1. Utilize stored procedure activity
    2. Pass the stored procedure name as “sp_executesql”
    3. Add a stored proc parameter as “stmt”
    4. Pass in your query & it should do the trick

    Give it a try & let me know if that serves your purpose 🙂

    1. Yea, I mentioned that you could use a stored proc, but it isn’t ideal and is still a hack compared to just being able to execute a query.
