I have been working with OData data sources in Power BI for a while. One challenge I almost always face is that I do not have a good understanding of the underlying data model. It can be really hard and time-consuming if no one in the business understands it either. I know, we can use $metadata to get the metadata schema from the OData feed, but let’s not go there. I am not an OData expert, but here is the thing: someone like me works with various data sources without necessarily being an expert in them, yet still needs to understand what the entities are, how they are connected and so on. So what if I do not have access to any SMEs (Subject Matter Experts) who can help me with that?
So, rather than getting involved with more OData options, let’s get into it.
The custom function below accepts an OData URL and discovers all tables in the feed, along with their column count, their row count (more on this later), the number and list of related tables, and the number and list of columns of type text, type number and Decimal.Type.
// fnODataFeedAnalyser
(ODataFeed as text) =>
let
    Source = OData.Feed(ODataFeed),
    // Flatten the feed's navigation table; the old header row becomes a data row
    SourceToTable = Table.RenameColumns(
        Table.DemoteHeaders(Table.FromValue(Source)),
        {{"Column1", "Name"}, {"Column2", "Data"}}
    ),
    // Keep only rows whose Data value is a table (drops the demoted header row and other non-table values)
    FilterTables = Table.SelectRows(
        SourceToTable,
        each Type.Is(Value.Type([Data]), Table.Type)
    ),
    // Table.Schema describes every column of each table (Name, Kind, TypeName, ...)
    SchemaAdded = Table.AddColumn(FilterTables, "Schema", each Table.Schema([Data])),
    TableColumnCountAdded = Table.AddColumn(
        SchemaAdded,
        "Table Column Count",
        each Table.ColumnCount([Data]),
        Int64.Type
    ),
    // Counting rows can be slow on large or wide tables
    TableCountRowsAdded = Table.AddColumn(
        TableColumnCountAdded,
        "Table Row Count",
        each Table.RowCount([Data]),
        Int64.Type
    ),
    // Columns of type table are the navigation columns pointing to related tables
    NumberOfRelatedTablesAdded = Table.AddColumn(
        TableCountRowsAdded,
        "Number of Related Tables",
        each List.Count(Table.ColumnsOfType([Data], {Table.Type})),
        Int64.Type
    ),
    ListOfRelatedTables = Table.AddColumn(
        NumberOfRelatedTablesAdded,
        "List of Related Tables",
        each
            if [Number of Related Tables] = 0 then
                null
            else
                Table.ColumnsOfType([Data], {Table.Type}),
        List.Type
    ),
    NumberOfTextColumnsAdded = Table.AddColumn(
        ListOfRelatedTables,
        "Number of Text Columns",
        each List.Count(Table.SelectRows([Schema], each Text.Contains([Kind], "text"))[Name]),
        Int64.Type
    ),
    ListOfTextColumnsAdded = Table.AddColumn(
        NumberOfTextColumnsAdded,
        "List of Text Columns",
        each
            if [Number of Text Columns] = 0 then
                null
            else
                Table.SelectRows([Schema], each Text.Contains([Kind], "text"))[Name]
    ),
    NumberOfNumericColumnsAdded = Table.AddColumn(
        ListOfTextColumnsAdded,
        "Number of Numeric Columns",
        each List.Count(Table.SelectRows([Schema], each Text.Contains([Kind], "number"))[Name]),
        Int64.Type
    ),
    ListOfNumericColumnsAdded = Table.AddColumn(
        NumberOfNumericColumnsAdded,
        "List of Numeric Columns",
        each
            if [Number of Numeric Columns] = 0 then
                null
            else
                Table.SelectRows([Schema], each Text.Contains([Kind], "number"))[Name]
    ),
    // Decimal columns are identified by TypeName rather than Kind
    NumberOfDecimalColumnsAdded = Table.AddColumn(
        ListOfNumericColumnsAdded,
        "Number of Decimal Columns",
        each List.Count(
            Table.SelectRows([Schema], each Text.Contains([TypeName], "Decimal.Type"))[Name]
        ),
        Int64.Type
    ),
    ListOfDecimalColumnsAdded = Table.AddColumn(
        NumberOfDecimalColumnsAdded,
        "List of Decimal Columns",
        each
            if [Number of Decimal Columns] = 0 then
                null
            else
                Table.SelectRows([Schema], each Text.Contains([TypeName], "Decimal.Type"))[Name]
    ),
    // Keep only the summary columns; the Data and Schema helper columns are dropped
    #"Removed Other Columns" = Table.SelectColumns(
        ListOfDecimalColumnsAdded,
        {
            "Name",
            "Table Column Count",
            "Table Row Count",
            "Number of Related Tables",
            "List of Related Tables",
            "Number of Text Columns",
            "List of Text Columns",
            "Number of Numeric Columns",
            "List of Numeric Columns",
            "Number of Decimal Columns",
            "List of Decimal Columns"
        }
    )
in
    #"Removed Other Columns"
Here is the GitHub link for the above code.
I used this function for the initial investigation of various OData sources, including Microsoft Project Online, Microsoft Business Central, some third-party tools and, of course, the Northwind sample. While it runs fine against all of those data sources, for some of them, such as Business Central, the results are not very helpful. So be mindful of that.
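As a quick illustration, invoking the function from a blank query might look like the sketch below. It assumes the function has been saved as a query named fnODataFeedAnalyser, and it uses the public Northwind sample service URL purely as an example; swap in your own feed. The sort step is just one possible way to surface the most connected tables first.

```powerquery
let
    // Assumption: the public Northwind sample feed; replace with your own OData URL
    FeedUrl = "https://services.odata.org/V3/Northwind/Northwind.svc/",
    // Assumption: the custom function is saved as a query named fnODataFeedAnalyser
    Analysis = fnODataFeedAnalyser(FeedUrl),
    // Show the tables with the most related tables at the top
    Sorted = Table.Sort(Analysis, {{"Number of Related Tables", Order.Descending}})
in
    Sorted
```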
I used Power Query Formatter to format the above code, then polished it a bit to suit my taste. Give it a go, it’s a good tool.
As mentioned earlier, the above function shows each table’s column count as well as its row count. I would like to raise a point about the latter: if the underlying table has a lot of columns, the row count calculation may take a long time.
The screenshot below shows the results of the fnODataFeedAnalyser function invoked for a Microsoft Project Online feed; it took a wee bit less than 3 minutes to run.
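If the row count is what slows things down on a particular feed, one option is to make it optional. The following is only a minimal sketch of that idea, not a replacement for the full function: the name fnODataFeedTableSizes and the IncludeRowCount parameter are hypothetical, and it assumes the feed's navigation table exposes Name and Data columns directly.

```powerquery
// fnODataFeedTableSizes (hypothetical name): column counts, with row counts only on request
(ODataFeed as text, optional IncludeRowCount as nullable logical) =>
let
    // Treat a missing or null argument as "skip the row count"
    CountRows = if IncludeRowCount = null then false else IncludeRowCount,
    Source = OData.Feed(ODataFeed),
    // Assumption: the navigation table has Name and Data columns
    TablesOnly = Table.SelectRows(Source, each Value.Is([Data], type table)),
    ColumnCountAdded = Table.AddColumn(
        TablesOnly,
        "Table Column Count",
        each Table.ColumnCount([Data]),
        Int64.Type
    ),
    // Table.RowCount is only evaluated when CountRows is true
    RowCountAdded = Table.AddColumn(
        ColumnCountAdded,
        "Table Row Count",
        each if CountRows then Table.RowCount([Data]) else null,
        type nullable number
    ),
    Result = Table.SelectColumns(
        RowCountAdded,
        {"Name", "Table Column Count", "Table Row Count"}
    )
in
    Result
```

Calling it with only the URL returns null in the row count column, which keeps the first pass over an unfamiliar feed quick; passing true as the second argument brings the row counts back.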
Have you used this method before to analyse a dataset whose structure you are not familiar with? Do you have a better idea? Please share your thoughts in the comments section below.
Oh! and… by the way, feel free to change the above code and make it better. Just do not forget to share the improved version with the community.