So exciting to see two new data views inside of Salesforce Marketing Cloud! Huzzah! But that raises the question…WHY is this so exciting? What can we DO with these? My intention is to explain them a bit and provide some strong examples of what you can do with these great new data views. Now, there are likely a ton of amazing possibilities that I will not get to, so please keep in mind that this is most certainly not an exhaustive list.

Before getting into what is possible and what we can do, let’s first jump into exactly what these are, their restrictions, context, and gotchas, and WHY they’re exciting.

Automation Data Views?

First off, there are two different Data Views that have been introduced, _AutomationInstance and AutomationActivityInstance. These two Data Views provide unprecedented access to the individual instances, or each individual run, of an automation as well as the individual activities inside each run.

This means that what previously required the SOAP API objects or some undocumented REST API endpoints can now be done with a simple SQL query against a data view. This is AWESOME. Sure, you could get most of this info elsewhere before, but it was not easily accessible: you had to make API calls, either internally or via a 3rd party service, and even then not all of this lived in a single object or was easily digestible – meaning a lot of mapping of multiple calls and results to pull the information together.

These new Data Views provide all of that information with a fairly low skill level requirement to access. With some simple SQL skills, you can be building custom reports on your Automations and the Activities in those automations!

Let’s dive into a quick definition of each of these new data views:

_AutomationInstance

This data view provides historical information about each of an Automation’s runs, or Instances.

_AutomationActivityInstance

This data view provides historical information about each of the individual Activities inside each Automation’s run, or Instance.

There are a ton of sites out there, like the official docs and Mateusz’s blog, for the nitty gritty details around the data views and their fields and even some simple example SQL queries – so I am not going to go into heavy detail on the fields and data types, etc. Plus with how well done the docs and Mateusz’s stuff is, I feel like it would be a downgrade in comparison 🙂

So now that we have the answer to ‘why are they cool?’ and the basics of what each data view is, we next need to move into some considerations to ensure you use these correctly and accurately.

Considerations

Below are the major considerations to keep in mind around these new data views:

24 hour delay before data is displayed

Before an automation appears in either data view, a full 24 hours must pass. This is not a real-time data view (you will need the APIs for that). The delay is measured from the start date/time of the Instance, not the completion time – which is great, as it means that if an automation takes, say, 3 hours, you don’t need to wait 27 hours to see it, just 24 from its start.

This delay applies to both _AutomationInstance and _AutomationActivityInstance. In _AutomationActivityInstance it is also based on the start of the automation instance, not on the individual activities within it. So, once the current time is past the 24 hour mark since the automation instance started, that instance will show in _AutomationInstance, and each of the corresponding Activities in that instance will show inside _AutomationActivityInstance – regardless of when those activities started or completed.

30 Day Retention of Data

Both of the Automation data views have a retention period of 30 days, meaning the data only exists in the data view from 24 hours after the run until 30 days have passed from the start of the automation. This is a highly important consideration, especially since the majority of other Data Views have a 6 month retention policy.

It definitely makes sense that this is only a 30 day retention, as this is a lot of information and a lot of processing to handle every automation run and its individual activities, but since it is outside the norm, it is highly important to take note of. This makes the data a bit more transitory than in other areas, so if you want to retain this information beyond that timeline, you will need to build your own custom solution.
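One simple custom solution is a scheduled Query Activity that copies the data view into your own data extension each day. Below is a minimal sketch, assuming a hypothetical archive DE (in SFMC, the target DE and the Update data action are configured on the Query Activity itself, not in the SQL):

SELECT MemberID,
	AutomationName,
	AutomationInstanceID,
	AutomationInstanceStartTime_UTC,
	AutomationInstanceEndTime_UTC,
	AutomationInstanceStatus
FROM [_AutomationInstance]

Run this daily with the Update data action and a primary key defined on the archive DE (AutomationInstanceID is a reasonable assumption, though unofficial), and each run adds new instances without duplicating the ones already archived.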

UTC Time Zone

Unlike all the other data views, the Automation data views display times in UTC, not CST. This means there is a 6 hour difference between these and the other data views. For example, an email sent from an automation may show a send time of 2022-10-22 10:00:00, but inside the Automation data views the automation may show as starting at 2022-10-22 16:00:00 due to that 6 hour difference. To ensure accurate matching and coordination between data sets, you will want to keep this timezone difference in mind, as it is unique to these data views.
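If you want to line these up with the other data views, one simple option is to convert inside the query itself. A minimal sketch, assuming a fixed 6 hour offset (SFMC server time is CST and does not observe daylight saving, so the offset is constant):

SELECT AutomationName,
	AutomationInstanceStartTime_UTC,
	DATEADD(HOUR, -6, AutomationInstanceStartTime_UTC) as AutomationInstanceStartTime_CST
FROM [_AutomationInstance]

The _CST column name here is just an illustrative alias; storing both the UTC original and the converted value makes later joins against other data views much less error-prone.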

No Enterprise view

The Automation data views are, similar to the Job data view, scoped to each individual Business Unit and do not offer an Enterprise wide view capability. This is important because it means you would need to run queries in every business unit individually and combine the results in order to cover the whole Enterprise.

No Defined Primary Keys

There are assumptions you can make about what the primary keys should be in these data views, and for the most part those assumptions are likely correct, but there is no official statement of what the primary keys of these Data Views are. This means you will need to carefully consider how you set up any related data extensions or archival DEs based on these Data Views to ensure you correctly gather each record.
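For example, AutomationInstanceID looks like a natural key for _AutomationInstance (and AutomationInstanceID plus ActivityInstanceID for the activity view), but since that is only an assumption, a defensive de-dupe in your query is cheap insurance. A sketch:

SELECT MemberID,
	AutomationName,
	AutomationInstanceID,
	AutomationInstanceStartTime_UTC,
	AutomationInstanceStatus
FROM (
	SELECT MemberID,
		AutomationName,
		AutomationInstanceID,
		AutomationInstanceStartTime_UTC,
		AutomationInstanceStatus,
		ROW_NUMBER() OVER (
			PARTITION BY AutomationInstanceID
			ORDER BY AutomationInstanceStartTime_UTC DESC
		) as rowNum
	FROM [_AutomationInstance]
) dedupe
WHERE rowNum = 1

This keeps one row per assumed key even if the view ever returns duplicates, so your archival DE stays clean either way.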

Now that we have the main ‘gotchas’ out of the way, let’s dive into the great possibilities that are opened up with these Data Views!

Use Cases

There are a TON of great use cases out there; the official docs even include a few with example queries. The ones I have chosen here are those I felt were the ‘coolest’ and would have the most impact or value in general.

Track Trigger Files From File Drop Automations

Now, this here is a GEM of a data point. Inside the _AutomationInstance data view there is a field named FilenameFromTrigger (if you use file drop automations with imports and file transfers, this likely feels super familiar), which gives you the full name of the file that triggered that particular automation run. This is a great data point to log and keep: if you ever run into issues with your file drop automation, you can verify that when it ran, it ran using the right file – which you could not really do before.

Example Query:

SELECT MemberID,
	AutomationName,
	AutomationInstanceID,
	FilenameFromTrigger,
	AutomationInstanceStartTime_UTC,
	AutomationInstanceEndTime_UTC,
	AutomationInstanceStatus,
	AutomationInstanceActivityErrorDetails
FROM [_AutomationInstance]
WHERE AutomationType = 'File Drop'

To be fair, there is also an AutomationType of ‘Triggered’ that you may want to include, but I have not yet found an automation in any of my accounts that returns that type, so I did not include it in my sample. You would just change the last line to WHERE AutomationType IN ('File Drop','Triggered')

RunTime Tracking

Who has not felt, at some point, like an automation took WAAAAY longer to run than other times? Well, you are likely right! Now we can dig into this information and build a custom report on it! You can use this info to review which runs of which automation took the longest, compare against days/times/months/etc. to see when there were delays or extra processing draws, and adjust accordingly to optimize. There is a ton of insight you can gain by doing runtime tracking.

Essentially, you use some simple math and SQL functions to get the time elapsed from start to finish and store it in a DE. From there you can query against that: what was the longest run of an automation? Which day of the week has the longest average run time? This can help you better optimize and shape the most efficient version of your automation.

Example Query:

SELECT MemberID,
	AutomationName,
	AutomationInstanceID,
	AutomationInstanceIsRunOnce,
	AutomationInstanceStartTime_UTC,
	AutomationInstanceEndTime_UTC,
	CONCAT(format((raw.runTime/3600),'00'),':',format(((raw.runTime/60) - ((raw.runTime/3600)*60)),'00'),':',format((raw.runTime - ((raw.runTime/60)*60)),'00')) as runTimeFormatted,
	raw.runTime,
	AutomationInstanceStatus,
	AutomationInstanceActivityErrorDetails    
FROM [_AutomationInstance]
OUTER APPLY (
   	SELECT DATEDIFF(SECOND,AutomationInstanceStartTime_UTC,AutomationInstanceEndTime_UTC) as runTime 
) raw

Now, you will notice that I included a more visually ‘readable’ version (runTimeFormatted) so we can easily see hours/minutes/seconds, but also a raw second count (runTime). This gives us an easy human reference via the formatted view, plus a raw field that is easier to sort/filter/query on.

Review Fluctuations in Activity Run Times

So, having each individual run of an automation measured for elapsed time is super helpful, but to go even further, why not also measure the fluctuations in run times of the activities inside those automations?! This way we can not only see what outside variables affect the automations overall, but also hone in on how they affect specific activities. A great use here is to find where the run time of Queries or Scripts spikes, so you know to move them to a different date/time or otherwise set up guardrails to help prevent timeouts or other similar issues.

Example Query:

SELECT
	a.AutomationName,
	a.AutomationInstanceID,
	a.ActivityType,
	a.ActivityName,
	a.ActivityInstanceID,
	a.ActivityInstanceStartTime_UTC,
	a.ActivityInstanceEndTime_UTC,
	avg.RunTime as AvgRunTime,
	cur.RunTime as InstanceRunTime,
	(avg.RunTime - cur.RunTime) as Diff
FROM  [_automationactivityinstance] a
OUTER APPLY (
    SELECT TOP 1 Avg(Datediff(MINUTE, b.ActivityInstanceStartTime_UTC, b.ActivityInstanceEndTime_UTC)) AS RunTime
        FROM   [_automationactivityinstance] b
        WHERE  b.AutomationCustomerKey = a.AutomationCustomerKey
        AND b.ActivityCustomerKey = a.ActivityCustomerKey
        GROUP BY b.AutomationCustomerKey, b.ActivityCustomerKey
) avg
OUTER APPLY (
    SELECT TOP 1 Datediff(MINUTE, a.ActivityInstanceStartTime_UTC, a.ActivityInstanceEndTime_UTC) AS RunTime
) cur

By using the above, you will get the average run time of each activity inside each automation, along with that specific instance’s run time. From there, simple math gives you the difference between the two, so you can see where and when spikes are happening and adjust your strategy accordingly.

Review All RunOnce Automations

This one is more of an audit trail than a major report or otherwise fancy use, but it is extremely useful. It helps you find the automations that were run ‘ad hoc’ via the Run Once button in Automation Studio. By pulling the IsRunOnce records out into a separate data extension, you get a good view of ad hoc runs in the account, letting you explore and review what work was being done where, and whether there were any issues or troubleshooting happening in those automations.

Example Query:

SELECT MemberID,
	AutomationName,
	AutomationInstanceID,
	AutomationInstanceIsRunOnce,
	AutomationInstanceStartTime_UTC,
	AutomationInstanceEndTime_UTC,
	AutomationInstanceStatus,
	AutomationInstanceActivityErrorDetails    
FROM [_AutomationInstance]
WHERE AutomationInstanceIsRunOnce = 1

You can also do this in reverse and only grab the records that are not ‘IsRunOnce’ into a separate data extension. Removing Run Once automations from the scheduled ones can give you more accurate and consistent measurements, as it removes a bunch of variables by only using runs that were programmatically initiated – which may yield better averages for your custom reports and measurements.
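The inverted version is simply the same query with the flag flipped (assuming, as in the query above, that the field is stored as a bit):

SELECT MemberID,
	AutomationName,
	AutomationInstanceID,
	AutomationInstanceIsRunOnce,
	AutomationInstanceStartTime_UTC,
	AutomationInstanceEndTime_UTC,
	AutomationInstanceStatus,
	AutomationInstanceActivityErrorDetails
FROM [_AutomationInstance]
WHERE AutomationInstanceIsRunOnce = 0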

…And more!

There are loads more options here and a ton of things that are yet to be discovered or released, but this is the end of my article. I am so looking forward to the future and further releases by SFMC! What a great treasure! Looking forward to seeing what everyone else comes up with and all the great innovation from the community.
