API / Data Extensions / SFMC / SSJS / WSProxy April 15, 2021
A common request I hear is the need for a quick reference place for the data extensions inside of a specific account as well as the details pertaining to them. Without the API, gathering this information can be a major pain in the butt.
My previous solution was great, but over time it has become inefficient and has run into some issues. This new updated version is very similar, but modernized to be more efficient and informative.
A quick note that this version is intended only to be used for a single BU and not to be used for enterprise-wide inventory. This whole script is based on WSProxy and Core functions, making it context dependent.
First thing to do is to create the appropriate data extension, for this case I named it DE_Inventory
. See below screenshot for field details:
This data extension will hold the full list of the data extensions you have inside the DE. You may notice I added a field named LU
. This field will be used for AMPscript lookups or similar that require conditions or critera to pull information. So you can just do LOOKUPROWS("DE_Inventory","LU", 1)
or similar to get all the rows.
After the data extension is created, we will move on to the SSJS Script itself. First I will focus on the two functions that we use inside this script:
ClearDE
This function will completely clear out the Inventory Data Extension prior to beginning the rest of the script. The reason I have it set up this way is to ensure those DEs that were deleted get removed from your Inventory Data Extension. It requires WSProxy in order to work – so you will need to make sure to declare the prox
variable prior to calling in the function.
function clearDE(custKey) { var action = "ClearData"; var props = { CustomerKey: custKey }; var data = prox.performItem("DataExtension", props, action); return data; }
Get All DE Info
This function will gather all the ‘meta’ data attached to each data extension in your business unit. This also uses WSProxy and requires the prox
variable to be declared before being called. This also has a while statement to account for if there is more than 2500 records. The only consideration around that would be the potential timeout if you have a significant number of DEs.
function getAllDeInfo() { var cols = [ "ObjectID" ,"CustomerKey" ,"Name" ,"CategoryID" ,"CreatedDate" ,"ModifiedDate" ,"Client.ID" ,"Description" ,"IsSendable" ,"SendableDataExtensionField.Name" ,"SendableSubscriberField.Name" ,"Template.CustomerKey" ,"CategoryID" ,"DataRetentionPeriodLength" ,"DataRetentionPeriodUnitOfMeasure" ,"RowBasedRetention" ,"ResetRetentionPeriodOnImport" ,"DeleteAtEndOfRetentionPeriod" ,"RetainUntil" ,"DataRetentionPeriod" ]; var reqID = null var moreData = true; var desc = []; while(moreData) { moreData = false; var data = reqID == null ? prox.retrieve("DataExtension", cols, filter): prox.getNextBatch("DataExtension", reqID); if(data != null) { moreData = data.HasMoreRows; reqID = data.RequestID; if(data && data.Results) { desc.push(data); } } } return desc; }
Folder Path
I took this almost line for line from Zuzanna’s blog post on finding data extensions and their folder path. It is actually quite fast and effective in grabbing the folder path for a data extension. It is not quite the ‘normal’ defined function like the others, but is super powerful and worth its own notation. It essentially loops through to grab each parent folder until it hits the top level folder.
var list = []; list.push(name); var path = function(id) { if (id> 0) { var results = Folder.Retrieve({Property:"ID",SimpleOperator:"equals",Value:id}); if(results) { list.unshift(results[0].Name); return path(results[0].ParentFolder.ID); } else { list = ['error in path retrieve'] return list; } } else { return id; } }; path(catID); folderPath = list.join("> ");
Core Function for DE Fields
This is a straight usage of the SFMC SSJS Core function to retrieve a DEs fields. It surprisingly runs faster than the WSProxy or other API alternatives. The goal of this is to grab a list of the fields as well as the associated meta data to be reviewed. My current return is just a string version of the Object, but you can add some more parsing or massaging to it to get a more ‘readable’ version for your needs.
var myDE = DataExtension.Init(customerkey) var fieldObj = myDE.Fields.Retrieve()
The Script
Basically what this script does is take the data gathered from the getAllDEInfo function and runs it through a couple for loops to hit every single row of data that contains a data extension and then set the meta data to SSJS variables. It then massages that data while also grabbing more unique things like folder path, rowcount and fields. From there it then takes all this data and upserts it to the DE Inventory Data Extension and moves on to the next record.
To help explain what is going on and how its being used and what each part does, I added in comments to the below script. This was an easier way to provide context than trying to reference and describe each aspect.
<script runat=server> Platform.Load("Core","1.1.1"); var prox = new Script.Util.WSProxy(); // Sets the Proxy - DO NOT CHANGE var custKey = "DE_Inventory"; //External Key of the DE that will hold all the data we gather var buID = Platform.Recipient.GetAttributeValue('memberid') //Sets the ID of current BU try { //Will attempt to run the below stuff var deArr = getAllDeInfo(); //returns all the DE info from this Business Unit var clear = clearDE(custKey) //Clears out old records to ensure deletions are removed from the existing records for (var a=0; a<deArr.length;a++) { //Iterate through the array holding each of the objects from the DE info return var deData = deArr[a].Results; //Set the Object for us to pull for (var e=0; e<deData.length; e++) { //iterate through the Results of that object, setting each value var name = deData[e].Name if(name.indexOf("QueryStudioResults") < 0) { //Condition to remove Query Studio DEs //Sets variables pulling in the meta data of the DEs var customerkey = deData[e].CustomerKey var desc = deData[e].Description var catID = deData[e].CategoryID var isSendable = deData[e].IsSendable var sendableDEField = deData[e].SendableDataExtensionField var sendableSubField = deData[e].SendableSubscriberField var deRetentionLength = deData[e].DataRetentionPeriodLength var deRetentionUnit = deData[e].DataRetentionPeriodUnitOfMeasure var deRowRetention = deData[e].RowBasedRetention var deResetRetention = deData[e].ResetRetentionPeriodOnImport var deDeleteRetention = deData[e].DeleteAtEndOfRetentionPeriod var deRetainUntil = deData[e].RetainUntil var deRetentionPeriod = deData[e].DataRetentionPeriod var sendableDEName = sendableDEField ? sendableDEField.Name : ''; var sendableSubName = sendableSubField ? sendableSubField.Name : ''; //Massages the data retention information into a more readable format var dataRetentionPolicyType; var dataRetentionPolicy = deRetentionLength + ' ' + deRetentionPeriod; if(deRowRetention) { dataRetentionPolicyType = "Individual Records" } else if(!deRowRetention && (Date.parse(deRetainUntil) > 0 && deDeleteRetention)) { dataRetentionPolicyType = "Remove All Data" } else if(!deRowRetention && (Date.parse(deRetainUntil) > 0 && !deDeleteRetention)) { dataRetentionPolicyType = "Remove All Data and DE" } else { dataRetentionPolicyType = "N/A" } //Call the Core function for getting field array var myDE = DataExtension.Init(customerkey) var fieldObj = myDE.Fields.Retrieve() //Call the AMPscript function to return DE rowcounts var rowCount = Platform.Function.TreatAsContent('%'+'%=DataExtensionRowCount("' + name + '")=%' + '%'); //Pulls the folder path of the data extension //Special thanks to Zuzanna for this from her article on SFMarketing.cloud var list = []; list.push(name); var path = function(id) { if (id> 0) { var results = Folder.Retrieve({Property:"ID",SimpleOperator:"equals",Value:id}); if(results) { list.unshift(results[0].Name); return path(results[0].ParentFolder.ID); } else { list = ['error in path retrieve'] return list; } } else { return id; } }; path(catID); folderPath = list.join("> "); var nameArr = ["Description","CategoryID","FolderPath","IsSendable","SendableDEField","SendableSubField","Fields","RowCount","RetentionPolicy","RetentionPolicyType"] // Array of Non Primary Key names to pass to Data Extension as Column Names var valueArr = [desc,catID,folderPath,isSendable,sendableDEName,sendableSubName,Stringify(fieldObj),rowCount,dataRetentionPolicy,dataRetentionPolicyType] //Array of Non Primary Key values to pass to Data Extension as Column Values if(deRetainUntil) { nameArr.push("RetainUntil"); valueArr.push(deRetainUntil); } //Pushes the Retain Until into Name/Value array if it exists. If not, then removes from upsert. This prevents the date being displayed as 1/1/1900 if null. var upsertDeInfo = Platform.Function.UpsertData(custKey, ["MID","Name", "CustomerKey"], [buID, name, customerkey], nameArr, valueArr) //Upserts the pulled value into the data extension, using the above primary keys } //end condition to remove Query Studio } // end Results iterative } // end Array of DE Objects iterative } catch(e) { // Instead of tossing an error or exception, this will display the error in an output Write(Stringify(e)) // Writes the error object to the page } function getAllDeInfo() { var cols = [ "ObjectID" //,"PartnerKey" ,"CustomerKey" ,"Name" ,"CategoryID" ,"CreatedDate" ,"ModifiedDate" ,"Client.ID" ,"Description" ,"IsSendable" ,"SendableDataExtensionField.Name" ,"SendableSubscriberField.Name" ,"Template.CustomerKey" ,"CategoryID" //,"Status" //,"IsPlatformObject" ,"DataRetentionPeriodLength" ,"DataRetentionPeriodUnitOfMeasure" ,"RowBasedRetention" ,"ResetRetentionPeriodOnImport" ,"DeleteAtEndOfRetentionPeriod" ,"RetainUntil" ,"DataRetentionPeriod" ]; var reqID = null var moreData = true; var desc = []; while(moreData) { moreData = false; var data = reqID == null ? prox.retrieve("DataExtension", cols, filter): prox.getNextBatch("DataExtension", reqID); //Write(Stringify(data) + '\r\n\r\n') if(data != null) { moreData = data.HasMoreRows; reqID = data.RequestID; if(data && data.Results) { desc.push(data); } } } return desc; } function clearDE(custKey) { var action = "ClearData"; var props = { CustomerKey: custKey }; var data = prox.performItem("DataExtension", props, action); return data; } </script>
Last time I provided a simple way to output this into a CloudPage for viewing, but I have found most people prefer it either in the data extension or to build that part out themselves, so I am leaving that out in this one. Feel free to go back to my old article to look for that info if you want though.
Speed
Because we all know that the number of DEs in a count can reach the level of absurdity, I decided to do a bit of a test on processing for this script. For the test, I did 60 record intervals (to easily match 1:1 on seconds) where I measured the run time, the Records run per second and the Seconds per record run.
I ran it in both a cloud page and in a script activity and there was a major difference in processing speed that needs to be considered when building your version.
Cloudpage Average (in seconds) | 46.11305 |
Script Average (in seconds): | 12.19822 |
This is a huge difference of over 30 seconds on average run time of the 60 records!
My recommendation is to definitely use a Script Activity for this, not just for the increase in timeout limits, but also it has a significant processing boost. Admittedly this is not a comprehensive test, but it gives the beginning of a guideline for making sure you remain inside of the timeline and avoid any timeouts, etc.
Using this average for script activities, I came to a general number of ‘max’ records this script can hit inside of a 30 minute period as 12,507. Now, the run time fluctuated quite a bit, so I would likely just trim it to a Script activity with a max at 10,000 to ensure you do not hit the timeout window.
Now this test is by no means comprehensive but is more meant to be a general guideline to help when implementing. I would highly recommend doing your own load testing and performance testing if you have concerns due to volume, etc.
GOLD CONTENT!
Your content is amazing, thanks for all of this.
Unfortunately I am having some trouble with this. I have copied everything with your and ran the code in cloudpages. I seem to be having a UpsertData Error:
An error occurred when attempting to execute an UpsertData function call. See inner exception for details.”,”description”
I ran into the same error, it’s worked for me when I changed the custKey to “Your Data Extension Name” in the upsertdata function in line 88
Refer to the documentation: https://developer.salesforce.com/docs/atlas.en-us.noversion.mc-programmatic-content.meta/mc-programmatic-content/ssjs_platformDataExtensionUpsertData.htm
Awesome Content. Getting the same error with UpsertData function. Even after the change from custKey to ‘DE_Inventory’
FInally it worked 🙂 but errored out after 5934 records ( 30 mins timeout). I’ll need to do some performance testing with volume/time. Thanks
Hello,
Thanks for providing this information , However I have a doubt I see that you have added the condition to remove the “QueryStudioResults” What if I wanted to exclude even DE’s for some other folder how do I pass the paramter ?
Example : if i do not want to see DE’s from QueryStudioResults and Test folders.
Thanks
You would just need to adjust the filter (if statement) accordingly to enforce your requested logic.
This is really a huge help to what I am working right now. Additional question tho is it possible to use this on a scenario to create a data extension where I can search through all data extension that are related to an email address or subscriber key? On the above example it only shows the fields/attributes of a data extension any suggestion if I want the data or record of a field?
To a degree the script can be used as a baseline to accomplish what you want. But it is not an easy shift and also with the higher processing and data load – it would require a lot more timeout protection and such.
This is great! Thank you very much! I have a question, what if we wanted to also pull the file size for each of these Data Extensions?
Very much appreciate this well thought out approach solving endless hours of searching amongst other things. The only thing that I could see as an improvement would be pulling in the CreatedDate and ModifiedDate. I have added the fields to the data extension along with putting the field names to update on line 79 but still for the life of me can not get that data to append correctly.
Hi, I have a problem with that script. I do not receive any error. but the Data Extension is not updated with the records. why this can happened, do you have any idea? I was trying everything, minimalize the columns for simple one only, using Add. row function instead of upsertData. Could you help? 🙁
Hello. I’m not sure if you’re still having problems, but I experienced the same thing when I initially set my inventory up. It turns out I missed including the Default Values for the three items in the DE that are to have default values (RetentionPolicy = N/A, RetentionPolicyType = N/A, LU = 1). Once I added those defaults, the script successfully ran.
Hi, thank you for this. I’m not experieced with script but I got it to work, however it times out as I’m dealing with a large number of data extensions. Consequently I have been advised to split the script up into smaller chunks but I really don’t know how to. Could you perhaps show how this code could be split up into smaller sections? Thanks
Hello! This inventory is great! I’m in the process of doing an audit and cleanup of our SFMC instance and this is awesome for looking up the location of the DEs from our many, many folders and sub-folders.
One questions/issue … It doesn’t appear to be capturing all my DEs. I’ve exported the results and am trying to search, but some DEs just flat out aren’t in there even though I see them in SFMC. I have taken out the query studio filter. Are the any other reasons why DEs would be excluded?
My query didn’t time out; it ran for only about 10 minutes and returned just under a thousand records, saying the activity was completed successfully in Automation Studio. I thought maybe it was missing duplicate named DEs since “name” is a primary key. I made name a non-primary key and moved “name” to the array of non-primary key names and values in the script. The script still ran successfully and returned the same number of records.
I know you said this is not for enterprise-wide inventory. I am trying to run it in multiple individual BUs. My target DE is NOT in a “Shared Data Extensions” folder. Am I missing anything? Thanks for your help!
Wow, I would have never thought of this
LOOKUPROWS(“DE_name”,”LU”, 1)
Hi Gortonington, I have a DE. Is there a way to find out where all this DE is being referenced/dependent. I need a list of automations/journeys/….. where all this DE is referenced. Can we achieve this?
Yes, this is possible, but it is a LOT of steps and a lot of API calls. Long story short, you would have to use API calls to search through every query, import, export, email send definition, etc. to see where that DE is used and then take those and search each automation/journey activity to see which utilize those activities and then push that back. It is possible, but a lot of work and a lot of processing.
Thank You
Thank you!!! This was very helpful. I have added CreatedDate and ModifiedDate in the DE hoping for this to populate but it doesn’t. Is their a way to pull this info in the DE.
Thanks,
M.I
I have very little experience on scripting, I took your script, updated in a Script activity, replaced the DE Name and ran the it.
I know for a fact I have over 1K DEs in my instance, the script only fetched information for 43 DEs, am I missing something?
Hi i ran the scripts in SFMC automation but nothing returned in the DE…
I would like to know if this solution is still valid when trying to replicate the script that runs with ok status in automation studio, but unfortunately the DE does not populate data, my BU has 02 MID’s I put only 01, could that be why?
thank you for the help