Finding Minimum Date and Maximum Date Across All Tables in Power Query in Power BI and Excel

When we talk about data analysis in Power BI, creating a Date table is inevitable. There are different methods to create a Date table either in DAX or in Power Query. In DAX you my use either CALENDAR() function or CALENDARAUTO() function to create the Date table. In Power Query you may use a combination of List.Dates()#date() and #duration() functions. Either way, there is one point that is always challenging and it is how to find out a proper date range, starting from a date in the past and ending with a date in the future, that covers all relevant dates within the data model. One simple answer is, we can ask the business. The SMEs know what the valid date range is..

While this is a correct argument it is not always the case. Especially with the Start Date which is a date in the past. In many cases the business says:

Lets’s have a look at the data to find out.

That is also a correct point, we can always a look at the data, find all columns with either Date or DateTime datatypes then sort the data in ascending or descending order to get the results. But what if there many of them? Then this process can be very time consuming.

Many of you may already thought that we can use CALENDARAUTO() in DAX and we are good to go. Well, that’s not quite right. In many cases there are some Date or DateTime columns that must not be considered in our Date dimension. Like Birth Date or Deceased Date. More on this later in this post.

In this post I share a piece of code I wrote for myself. I was in a situation to identify the Start Date and the End Date of the date dimension many times, so I thought it might help you as well.

How it works?

The Power Query expressions I share in this post starts with getting all existing queries using:

  • #sections intrinsic variable
  • Filtering out the current query name, which is GetMinMaxAllDates in my sample, to avoid getting the following error:

Expression.Error: A cyclic reference was encountered during evaluation.

  • Filtering out the queries that are NOT as type table
  • Adding a new structured column named TableSchema that includes the tables’ structure
  • Expanding the TableSchema structured column keeping the Name and Kind columns and renaming the Name column to Column Name and the Kind column to Datatype
  • Filter the results to keep only the columns with either Date or DateTime datatypes
  • Filtering out unnecessary values from the Column Name like Birth Date
  • Adding a new column named Min Date that gets the minimum value of the column that appears in the Column Name column of the table value that appears in the Value column

Hmm! I suppose it is too much mentioning valuecolumn and table in different contexts. I hope I’m not making it even more confusing though.

  • Adding another new column named Max Date similar to how we created the Min Date
  • Extracting the minimum value of the Min Date column
  • Extracting the maximum values of the Max Date column
  • Showing the latter two as a list

So if you are looking for a solution here is the Power Query expressions that I use:

let
    AllQueries = #sections,
    RecordToTable = Record.ToTable(AllQueries[Section1]),
    FilterOutCurrentQuery = Table.SelectRows(RecordToTable, each [Name] <> "GetMinMaxAllDates" and Type.Is(Value.Type([Value]), type table) = true),
    AddTableSchemaColumn = Table.AddColumn(FilterOutCurrentQuery, "TableSchema", each try Table.Schema([Value]) otherwise null),
    ExpandTableSchema = Table.Buffer(Table.ExpandTableColumn(AddTableSchemaColumn, "TableSchema", {"Name", "Kind"}, {"Column Name", "Datatype"})),
    FilterTypes = Table.SelectRows(ExpandTableSchema, each ([Datatype] = "datetime" or [Datatype] = "date")),
    AddedMinDateColumn = Table.AddColumn(FilterTypes, "Min Date", each Date.From(List.Min(Table.Column([Value], [Column Name])))),
    AddedMaxDateColumn = Table.AddColumn(AddedMinDateColumn, "Max Date", each Date.From(List.Max(Table.Column([Value], [Column Name])))),
    FilterOutUnnecessaryColumns = Table.SelectRows(AddedMaxDateColumn, each ([Column Name] <> "BirthDate")),
    MinDate = List.Min(List.Combine({FilterOutUnnecessaryColumns[Min Date], FilterOutUnnecessaryColumns[Max Date]})),
    MaxDate = List.Max(List.Combine({FilterOutUnnecessaryColumns[Min Date], FilterOutUnnecessaryColumns[Max Date]})),
    MinMaxDates = {"Min Date = " & Text.From(MinDate), "Max Date = " & Text.From(MaxDate)}
in
        MinMaxDates

You can download the above expressions from here.

The image below illustrates the results of running the above code in Power Query Editor having 11 fact tables and 2 dimension tables. Those tables have 17 columns with either Date or DateTime datatypes:

GetMinMaxAllDates Query in Power Query

Note: Once again, you need to pass the current query name in the expressions above. In my case the current query name is GetMinMaxAllDates as shown in the image below:

Earlier in this post I mentioned that in many cases we do NOT want all Date or DateTime columns to be covered by the Date table. A good example for it is Birth Date and Deceased Date. If we do not note that then we can create a lot of irrelevant dates in our Date table like what we get as the Min Date in the above image which is 10/02/1916. As you can in the image above there is a FilterOutUnnecessaryColumns step. We click on that step to filter the unnecessary values from the Column Name column as shown in the image below:

Click on the last step which is MinMaxDates to see the new values as shown in the image below:

By running the above query you get the valid date range, so you can now create a Date table with any method of choice, either in Power Query or DAX using the above date range. Remember, creating the Date table is completely separate process. This query is only helping us finding minimum and maximum valid dates across all tables loaded into the Power Query Editor.

Please note that this method is only to get the min and max valid dates across all tables. If you attempt to load the results into the data model in Power BI, the resulting table will be empty. This behaviour is the result of using #sections or #shared intrinsic variables in Power Query and their inherent limitations. While #sections is useful for accessing metadata about all queries, its dynamic nature can cause problems when trying to load data into the Power BI model. Therefore, we have to Disable query load to avoid getting an empty table in the data model.

Considerations

  • The above tables altogether have 40M rows and the GetMinMaxAllDates query ran in approximately 10 sec on my machine which is not bad at all. However, in larger tables it may take more to give you the results
  • You must have some queries already loaded into the Power BI Editor
  • This method also works in Direct Query mode, but you expect the query to take more time to get the results
  • The above query retrieves the min date and max date across all tables. When you create a Date table, be aware that the Date column should start from the 1st Jan of the min date going all the way up to the 31st Dec of the max date
  • This method works in Power Query Editor within Power BI Desktop RS as well
  • This method is NOT supported in Power BI Dataflows

Enjoy your Dating!