A common request I hear is the need for a quick reference place for the data extensions inside of a specific account as well as the details pertaining to them. Without the API, gathering this information can be a major pain in the butt.

My previous solution was great, but over time it has become inefficient and has run into some issues. This new updated version is very similar, but modernized to be more efficient and informative.

A quick note that this version is intended only to be used for a single BU and not to be used for enterprise-wide inventory. This whole script is based on WSProxy and Core functions, making it context dependent.

First thing to do is to create the appropriate data extension, for this case I named it DE_Inventory. See below screenshot for field details:

This data extension will hold the full list of the data extensions you have inside the DE. You may notice I added a field named LU. This field will be used for AMPscript lookups or similar that require conditions or critera to pull information. So you can just do LOOKUPROWS("DE_Inventory","LU", 1) or similar to get all the rows.

After the data extension is created, we will move on to the SSJS Script itself. First I will focus on the two functions that we use inside this script:

ClearDE

This function will completely clear out the Inventory Data Extension prior to beginning the rest of the script. The reason I have it set up this way is to ensure those DEs that were deleted get removed from your Inventory Data Extension. It requires WSProxy in order to work – so you will need to make sure to declare the prox variable prior to calling in the function.

function clearDE(custKey) {
  var action = "ClearData";
  var props = {
          CustomerKey: custKey
  };
  var data = prox.performItem("DataExtension", props, action);

  return data;
}

Get All DE Info

This function will gather all the ‘meta’ data attached to each data extension in your business unit. This also uses WSProxy and requires the prox variable to be declared before being called. This also has a while statement to account for if there is more than 2500 records. The only consideration around that would be the potential timeout if you have a significant number of DEs.

function getAllDeInfo() {
  var cols = [
    "ObjectID"
    ,"CustomerKey"
    ,"Name"
    ,"CategoryID"
    ,"CreatedDate"
    ,"ModifiedDate"
    ,"Client.ID"
    ,"Description"
    ,"IsSendable"
    ,"SendableDataExtensionField.Name"
    ,"SendableSubscriberField.Name"
    ,"Template.CustomerKey"
    ,"CategoryID"
    ,"DataRetentionPeriodLength"
    ,"DataRetentionPeriodUnitOfMeasure"
    ,"RowBasedRetention"
    ,"ResetRetentionPeriodOnImport"
    ,"DeleteAtEndOfRetentionPeriod"
    ,"RetainUntil"
    ,"DataRetentionPeriod"
  ];

  var reqID = null
  var moreData = true;
  var desc = [];

  while(moreData) {
    moreData = false;
    var data = reqID == null ?
      prox.retrieve("DataExtension", cols, filter):
      prox.getNextBatch("DataExtension", reqID);
    if(data != null) {
      moreData = data.HasMoreRows;
      reqID = data.RequestID;
      if(data && data.Results) {
        desc.push(data);
      }
    }
  }

  return desc;
}

Folder Path

I took this almost line for line from Zuzanna’s blog post on finding data extensions and their folder path. It is actually quite fast and effective in grabbing the folder path for a data extension. It is not quite the ‘normal’ defined function like the others, but is super powerful and worth its own notation. It essentially loops through to grab each parent folder until it hits the top level folder.

var list = [];
  list.push(name);
var path = function(id) {
    if (id> 0) {
        var results = Folder.Retrieve({Property:"ID",SimpleOperator:"equals",Value:id});

        if(results) {
          list.unshift(results[0].Name);
          return path(results[0].ParentFolder.ID);
        } else {
          list = ['error in path retrieve']
          return list;
        }
    } else {
        return id;
    }
};
path(catID);
folderPath = list.join("> ");

Core Function for DE Fields

This is a straight usage of the SFMC SSJS Core function to retrieve a DEs fields. It surprisingly runs faster than the WSProxy or other API alternatives. The goal of this is to grab a list of the fields as well as the associated meta data to be reviewed. My current return is just a string version of the Object, but you can add some more parsing or massaging to it to get a more ‘readable’ version for your needs.

var myDE = DataExtension.Init(customerkey)
var fieldObj = myDE.Fields.Retrieve()

The Script

Basically what this script does is take the data gathered from the getAllDEInfo function and runs it through a couple for loops to hit every single row of data that contains a data extension and then set the meta data to SSJS variables. It then massages that data while also grabbing more unique things like folder path, rowcount and fields. From there it then takes all this data and upserts it to the DE Inventory Data Extension and moves on to the next record.

To help explain what is going on and how its being used and what each part does, I added in comments to the below script. This was an easier way to provide context than trying to reference and describe each aspect.

<script runat=server>
Platform.Load("Core","1.1.1");

var prox = new Script.Util.WSProxy(); // Sets the Proxy - DO NOT CHANGE
var custKey = "DE_Inventory"; //External Key of the DE that will hold all the data we gather
var buID = Platform.Recipient.GetAttributeValue('memberid') //Sets the ID of current BU

try { //Will attempt to run the below stuff
  var deArr = getAllDeInfo();  //returns all the DE info from this Business Unit 
  var clear = clearDE(custKey)  //Clears out old records to ensure deletions are removed from the existing records

  for (var a=0; a<deArr.length;a++) { //Iterate through the array holding each of the objects from the DE info return
    var deData = deArr[a].Results;  //Set the Object for us to pull

    for (var e=0; e<deData.length; e++) { //iterate through the Results of that object, setting each value
      var name = deData[e].Name

      if(name.indexOf("QueryStudioResults") < 0) {  //Condition to remove Query Studio DEs
        //Sets variables pulling in the meta data of the DEs
        var customerkey = deData[e].CustomerKey
        var desc = deData[e].Description
        var catID = deData[e].CategoryID
        var isSendable = deData[e].IsSendable
        var sendableDEField = deData[e].SendableDataExtensionField
        var sendableSubField = deData[e].SendableSubscriberField
        var deRetentionLength = deData[e].DataRetentionPeriodLength
        var deRetentionUnit = deData[e].DataRetentionPeriodUnitOfMeasure
        var deRowRetention = deData[e].RowBasedRetention
        var deResetRetention = deData[e].ResetRetentionPeriodOnImport
        var deDeleteRetention = deData[e].DeleteAtEndOfRetentionPeriod
        var deRetainUntil = deData[e].RetainUntil
        var deRetentionPeriod = deData[e].DataRetentionPeriod
        var sendableDEName = sendableDEField ? sendableDEField.Name : '';
        var sendableSubName = sendableSubField ? sendableSubField.Name : '';
  
        //Massages the data retention information into a more readable format
        var dataRetentionPolicyType;
        var dataRetentionPolicy = deRetentionLength + ' ' + deRetentionPeriod;

        if(deRowRetention) {
            dataRetentionPolicyType = "Individual Records"
        } else if(!deRowRetention && (Date.parse(deRetainUntil) > 0 && deDeleteRetention)) {
            dataRetentionPolicyType = "Remove All Data"
        } else if(!deRowRetention && (Date.parse(deRetainUntil) > 0 && !deDeleteRetention)) {
            dataRetentionPolicyType = "Remove All Data and DE"
        } else {
            dataRetentionPolicyType = "N/A"
        }

        //Call the Core function for getting field array
        var myDE = DataExtension.Init(customerkey)
        var fieldObj = myDE.Fields.Retrieve()

        //Call the AMPscript function to return DE rowcounts 
        var rowCount = Platform.Function.TreatAsContent('%'+'%=DataExtensionRowCount("' + name + '")=%' + '%');

        //Pulls the folder path of the data extension
        //Special thanks to Zuzanna for this from her article on SFMarketing.cloud
        var list = [];
          list.push(name);
        var path = function(id) {
            if (id> 0) {
                var results = Folder.Retrieve({Property:"ID",SimpleOperator:"equals",Value:id});

                if(results) {
                  list.unshift(results[0].Name);
                  return path(results[0].ParentFolder.ID);
                } else {
                  list = ['error in path retrieve']
                  return list;
                }
            } else {
                return id;
            }
        };
        path(catID);
        folderPath = list.join("> ");
                                      
        var nameArr = ["Description","CategoryID","FolderPath","IsSendable","SendableDEField","SendableSubField","Fields","RowCount","RetentionPolicy","RetentionPolicyType"]
        // Array of Non Primary Key names to pass to Data Extension as Column Names

        var valueArr = [desc,catID,folderPath,isSendable,sendableDEName,sendableSubName,Stringify(fieldObj),rowCount,dataRetentionPolicy,dataRetentionPolicyType]
        //Array of Non Primary Key values to pass to Data Extension as Column Values

        if(deRetainUntil) { nameArr.push("RetainUntil"); valueArr.push(deRetainUntil); }
          //Pushes the Retain Until into Name/Value array if it exists. If not, then removes from upsert. This prevents the date being displayed as 1/1/1900 if null.

        var upsertDeInfo = Platform.Function.UpsertData(custKey, ["MID","Name", "CustomerKey"], [buID, name, customerkey], nameArr, valueArr)
        //Upserts the pulled value into the data extension, using the above primary keys

      } //end condition to remove Query Studio
    } // end Results iterative
  } // end Array of DE Objects iterative

} catch(e) { // Instead of tossing an error or exception, this will display the error in an output

 Write(Stringify(e)) // Writes the error object to the page

}

function getAllDeInfo() {
  var cols = [
    "ObjectID"
    //,"PartnerKey"
    ,"CustomerKey"
    ,"Name"
    ,"CategoryID"
    ,"CreatedDate"
    ,"ModifiedDate"
    ,"Client.ID"
    ,"Description"
    ,"IsSendable"
    ,"SendableDataExtensionField.Name"
    ,"SendableSubscriberField.Name"
    ,"Template.CustomerKey"
    ,"CategoryID"
    //,"Status"
    //,"IsPlatformObject"
    ,"DataRetentionPeriodLength"
    ,"DataRetentionPeriodUnitOfMeasure"
    ,"RowBasedRetention"
    ,"ResetRetentionPeriodOnImport"
    ,"DeleteAtEndOfRetentionPeriod"
    ,"RetainUntil"
    ,"DataRetentionPeriod"
  ];

  var reqID = null
  var moreData = true;
  var desc = [];

  while(moreData) {
    moreData = false;
    var data = reqID == null ?
      prox.retrieve("DataExtension", cols, filter):
      prox.getNextBatch("DataExtension", reqID);
    //Write(Stringify(data) + '\r\n\r\n')
    if(data != null) {
      moreData = data.HasMoreRows;
      reqID = data.RequestID;
      if(data && data.Results) {
        desc.push(data);
      }
    }
  }

  return desc;
}

function clearDE(custKey) {
  var action = "ClearData";
  var props = {
          CustomerKey: custKey
  };
  var data = prox.performItem("DataExtension", props, action);

  return data;
}
</script>

Last time I provided a simple way to output this into a CloudPage for viewing, but I have found most people prefer it either in the data extension or to build that part out themselves, so I am leaving that out in this one. Feel free to go back to my old article to look for that info if you want though.

Speed

Because we all know that the number of DEs in a count can reach the level of absurdity, I decided to do a bit of a test on processing for this script. For the test, I did 60 record intervals (to easily match 1:1 on seconds) where I measured the run time, the Records run per second and the Seconds per record run.

I ran it in both a cloud page and in a script activity and there was a major difference in processing speed that needs to be considered when building your version.

Cloudpage Average (in seconds)46.11305
Script Average (in seconds):12.19822

This is a huge difference of over 30 seconds on average run time of the 60 records!

My recommendation is to definitely use a Script Activity for this, not just for the increase in timeout limits, but also it has a significant processing boost. Admittedly this is not a comprehensive test, but it gives the beginning of a guideline for making sure you remain inside of the timeline and avoid any timeouts, etc.

Using this average for script activities, I came to a general number of ‘max’ records this script can hit inside of a 30 minute period as 12,507. Now, the run time fluctuated quite a bit, so I would likely just trim it to a Script activity with a max at 10,000 to ensure you do not hit the timeout window.

Now this test is by no means comprehensive but is more meant to be a general guideline to help when implementing. I would highly recommend doing your own load testing and performance testing if you have concerns due to volume, etc.

Tags: , , , , , , , , ,
Subscribe
Notify of
guest
23 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Ali Eroglu
Ali Eroglu
3 years ago

GOLD CONTENT!

Last edited 3 years ago by Ali Eroglu
Chris
Chris
3 years ago

Your content is amazing, thanks for all of this.
Unfortunately I am having some trouble with this. I have copied everything with your and ran the code in cloudpages. I seem to be having a UpsertData Error:
An error occurred when attempting to execute an UpsertData function call. See inner exception for details.”,”description”

Christina
Christina
Reply to  Chris
3 years ago

I ran into the same error, it’s worked for me when I changed the custKey to “Your Data Extension Name” in the upsertdata function in line 88

Refer to the documentation: https://developer.salesforce.com/docs/atlas.en-us.noversion.mc-programmatic-content.meta/mc-programmatic-content/ssjs_platformDataExtensionUpsertData.htm

Chris
Chris
Reply to  Chris
3 years ago

Awesome Content. Getting the same error with UpsertData function. Even after the change from custKey to ‘DE_Inventory’

Last edited 3 years ago by Chris
Manoj Chris
Manoj Chris
Reply to  Chris
3 years ago

FInally it worked 🙂 but errored out after 5934 records ( 30 mins timeout). I’ll need to do some performance testing with volume/time. Thanks

Omkar Kagati
Omkar Kagati
3 years ago

Hello,

Thanks for providing this information , However I have a doubt I see that you have added the condition to remove the “QueryStudioResults” What if I wanted to exclude even DE’s for some other folder how do I pass the paramter ?

Example : if i do not want to see DE’s from QueryStudioResults and Test folders.

Thanks

Andrea
Andrea
3 years ago

This is really a huge help to what I am working right now. Additional question tho is it possible to use this on a scenario to create a data extension where I can search through all data extension that are related to an email address or subscriber key? On the above example it only shows the fields/attributes of a data extension any suggestion if I want the data or record of a field?

Sebastian
Sebastian
2 years ago

This is great! Thank you very much! I have a question, what if we wanted to also pull the file size for each of these Data Extensions?

Matt
Matt
2 years ago

Very much appreciate this well thought out approach solving endless hours of searching amongst other things. The only thing that I could see as an improvement would be pulling in the CreatedDate and ModifiedDate. I have added the fields to the data extension along with putting the field names to update on line 79 but still for the life of me can not get that data to append correctly.

Alicja
Alicja
2 years ago

Hi, I have a problem with that script. I do not receive any error. but the Data Extension is not updated with the records. why this can happened, do you have any idea? I was trying everything, minimalize the columns for simple one only, using Add. row function instead of upsertData. Could you help? 🙁

Jill
Jill
Reply to  Alicja
2 years ago

Hello. I’m not sure if you’re still having problems, but I experienced the same thing when I initially set my inventory up. It turns out I missed including the Default Values for the three items in the DE that are to have default values (RetentionPolicy = N/A, RetentionPolicyType = N/A, LU = 1). Once I added those defaults, the script successfully ran.

Peter I
Peter I
2 years ago

Hi, thank you for this. I’m not experieced with script but I got it to work, however it times out as I’m dealing with a large number of data extensions. Consequently I have been advised to split the script up into smaller chunks but I really don’t know how to. Could you perhaps show how this code could be split up into smaller sections? Thanks

Jill
Jill
2 years ago

Hello! This inventory is great! I’m in the process of doing an audit and cleanup of our SFMC instance and this is awesome for looking up the location of the DEs from our many, many folders and sub-folders.

One questions/issue … It doesn’t appear to be capturing all my DEs. I’ve exported the results and am trying to search, but some DEs just flat out aren’t in there even though I see them in SFMC. I have taken out the query studio filter. Are the any other reasons why DEs would be excluded?

My query didn’t time out; it ran for only about 10 minutes and returned just under a thousand records, saying the activity was completed successfully in Automation Studio. I thought maybe it was missing duplicate named DEs since “name” is a primary key. I made name a non-primary key and moved “name” to the array of non-primary key names and values in the script. The script still ran successfully and returned the same number of records.

I know you said this is not for enterprise-wide inventory. I am trying to run it in multiple individual BUs. My target DE is NOT in a “Shared Data Extensions” folder. Am I missing anything? Thanks for your help!

JM Janzen
JM Janzen
2 years ago

Wow, I would have never thought of this
LOOKUPROWS(“DE_name”,”LU”, 1)

Rajesh Kommineni
Rajesh Kommineni
2 years ago

Hi Gortonington, I have a DE. Is there a way to find out where all this DE is being referenced/dependent. I need a list of automations/journeys/….. where all this DE is referenced. Can we achieve this?

Rajesh Kommineni
Rajesh Kommineni
Reply to  Gortonington
2 years ago

Thank You

Mike
Mike
1 year ago

Thank you!!! This was very helpful. I have added CreatedDate and ModifiedDate in the DE hoping for this to populate but it doesn’t. Is their a way to pull this info in the DE.

Thanks,

M.I

Pratik
Pratik
11 months ago

I have very little experience on scripting, I took your script, updated in a Script activity, replaced the DE Name and ran the it.
I know for a fact I have over 1K DEs in my instance, the script only fetched information for 43 DEs, am I missing something?

Jennifer
Jennifer
11 months ago

Hi i ran the scripts in SFMC automation but nothing returned in the DE…

Robson Martins
Robson Martins
11 months ago

I would like to know if this solution is still valid when trying to replicate the script that runs with ok status in automation studio, but unfortunately the DE does not populate data, my BU has 02 MID’s I put only 01, could that be why?
thank you for the help