Most applications and big data platforms available today support multiple programming languages with in-built scripting editors to make writing scripts and queries easier, especially for analysts who do not have previous experience with programming languages. The type of programming language used in a big data analysis project would also depend on the type, source and structure of the raw data that needs to be processed.
In this topic, you will learn:
- key protocols and techniques commonly applied to programming languages when working with big data
- how to use DAX (Data Analysis Expressions) to perform big data analysis tasks in the Microsoft Power BI Desktop.
Let's begin.
DAX formulas are similar to those used in Excel tables. However, there are some key differences.
DAX formulas are often written as calculations and come in three forms.23
- Calculated measures – used to perform aggregations (e.g. calculate totals, count distinct values etc.)
- Calculated columns – used to add additional columns to tables in a dataset
- Calculated tables – used to create a new table with required columns that can have aggregated values.
Let us understand the basic rules that you need to follow when writing DAX formulas and specific object naming requirements. We need to consider the structure or format for a DAX formula, then review some commonly used function and how to build statements. Finally, these can be assembled to create a query to help make sense of big data and build your analysis.
Basic rules
When writing DAX formulas for calculated columns or measures it is important to follow these general rules or protocols.
- A DAX formula always starts with an equal (
=
) sign - After the equals sign, comes the expression, which may include elements such as statements, functions, operators and values. It is important to note that expressions:
- are always read from left to right
- use parentheses to control the order in which the elements are grouped.
- A DAX function always references a complete column, measure or table. This is commonly known as the fully qualified name of that object. For example:
- The fully qualified name of a column refers to the table name, followed by the column name in square brackets. For example,
'AUS Order'[Products]
. - The fully qualified name of a measure refers to the table name, followed by the measure name in square brackets. For example,
'AUS Order'[Profit]
.
- The fully qualified name of a column refers to the table name, followed by the column name in square brackets. For example,
Object naming requirements
- All object names are case-insensitive. (e.g. The table names
'SALES'
and'Sales'
would represent the same table name).- Note that the term object here refers to tables, columns and measures.
- The names of objects must be unique within their current context. For example:
- measure names must be unique within a model
- column names must be unique within a table
- table names must be unique within a database
- Leading or trailing spaces that are enclosed by name delimiters, brackets, or single apostrophes are valid in the names of tables, columns, and measures.
- If the name of a table contains any of the following, you must enclose the table name in single quotation marks.
- Spaces (e.g.
'Order Detail'
) - Reserved keywords (e.g.
'Analysis Services'
cannot be used as a table name unless it is enclosed in quotation marks) - Disallowed characters
.,;':/\*|?&%$!+=()[]{}<>
- Characters outside the ANSI alphanumeric character range
- Spaces (e.g.
Following are some examples of object names used within DAX formulas.
Example | Object type | Comments on the naming requirement |
---|---|---|
Customer |
Table name | This table name does not contain any spaces or other special characters. Therefore, the name does not need to be enclosed in quotation marks. |
'AUS Orders' |
Table name | This table name contains a space therefore it needs to be enclosed in single quotation marks. |
'Discount%' |
Table name | This table name contains a special character. Therefore it needs to be enclosed in single quotation marks. |
Orders[Cost] |
Fully qualified column name | The table name precedes the column name and the column name is enclosed in square brackets. |
Orders[Profit] |
Fully qualified measure name | The table name precedes the measure name and the measure name is enclosed in square brackets. |
[Cost] |
Unqualified column name |
This is just the column name in square brackets. This can be used in certain situations in formulas such as:
|
'AUS Order'[Cost] |
Fully qualified column in table with spaces | The table name contains spaces, so it must be enclosed in single quotation marks. |
Operators
There are different types of operators used when writing DAX formulas. Some of the frequently used DAX operators are listed in the following table.24
Operator type | Metacharacter symbols | Meaning |
---|---|---|
Arithmetic operators | + |
Addition |
- |
Subtraction | |
/ |
Division | |
* |
Multiplication | |
Comparison operators | = |
Equal to |
== |
Strict equal to | |
> |
Greater than | |
< |
Less than | |
<= |
Less than or equal to | |
>= |
Greater than or equal to | |
<> |
Not equal to | |
Text concatenation operator | & |
Connects two values to produce one continuous text value |
Logical operators | && |
AND condition |
|| |
OR condition | |
Parenthesis operator | () |
Used for precedence order and grouping of arguments |
For more information on DAX operators, refer to the DAX operators - DAX | Microsoft Docs.
Knowledge check
Complete the following seven (7) questions. Click the arrows to navigate between the questions.
Now that you are aware of the general rules for writing DAX formulas, let’s look into the structure and basic syntax of a DAX formula.
Basic elements of a DAX statement
Before you can write your own DAX functions you need to be familiar with the various elements that make up a DAX formula. 25
The following represents the basic elements that make up a DAX statement.
Simply, what this DAX statement does, is to create a measure named Total Costs
that calculates (=
) the SUM()
of values in the CostAmount
column in the Orders
table.
Let’s understand each of the elements in this DAX measure.
- The name of the new measure (e.g.
Total Costs
) - The DAX formula (e.g. begins with the equals sign operator and ends with close parenthesis)
- Marks the beginning of the formula is indicated by an equals sign operator (
=
) - DAX function (e.g.
SUM()
– used to add all values in a given table’s column such asOrders[CostAmount]
) - Parenthesis
()
, which surrounds an expression that contains one or more arguments. Most functions require at least one argument. An argument passes a value to a function. - The name of the table that the formula applies to (e.g.
Orders
) - The name of the column that the formula applies to (e.g.
CostAmount
)
The following examples demonstrate how the DAX formula syntax can be used to create a calculated column, measure and table. In each of these examples notice how the syntax rules are followed.
- DAX column/measure name (e.g.
Profit
,Total Profit
,Customer
) - DAX Formula
Adding comments in DAX scripts
Adding comments will not affect the performance of the scripts/queries but would make it easier for others to understand the written code and make changes later on if needed.
Following are the different methods of commenting.
In-line comments
Method: Start your comment with two hyphens/dashes (--
).
In the following DAX statement, notice the use of hyphens to embed the comment inline. This indicates that anything after the hyphens is the comment text.
Orders = SELECTCOLUMNS('Company Data', --this is an inline comment "Customer ID", [Customer ID], "Customer Name", [Customer Name], "Customer Address", [Customer Address] )
Single-line comments
Method: Begin the comment with the //
symbol.
In the following DAX statement, notice the use of the //
symbol to comment the entire line of text.
Orders = SELECTCOLUMNS('Company Data', "Customer ID", [Customer ID], // this is a single-line comment "Customer Name", [Customer Name], "Customer Address", [Customer Address] )
Commenting-out multiple lines of code
Method: Place the code in between /*
and */
symbols.
In the following DAX statement, the specific two lines of the script won’t execute because it is commented out.
Orders = SELECTCOLUMNS('Company Data', "Customer ID", [Customer ID] /* this is a multi-line comment "Customer Name", [Customer Name], "Customer Address", [Customer Address] */ )
Complete the following activity to check your understanding of the DAX formula syntax and commenting methods.
Knowledge check
Complete the following three (3) questions. Click the arrows to navigate between the questions.
DAX functions help to perform commonly used data calculations on data models, databases, tables etc.
Types of DAX functions
There are a variety of DAX functions that are built into the DAX language that can be used for various purposes when writing DAX formulas. It is important to note that each DAX function, has specific syntax rules that must be followed to ensure the script executes without any errors.
For more information on DAX functions and their specific syntax rules, refer to the DAX function reference - DAX | Microsoft Docs.
The following table explores and provides examples of some of the DAX functions, their purpose and how they can be used when writing queries.
Function type | DAX functions | What it does | Example |
---|---|---|---|
Aggregation functions |
SUM()
|
Used to add all the numbers that are in a column. |
The following example adds all the numbers in the =SUM(Sales[Cost]) |
SUMX()
|
Returns the sum of an expression evaluated for each row in a table. |
The following example shows the SUMX function taking the =SUMX(Sales, Sales[Unit Price] * Sales[Discount%] ) |
|
COUNT()
|
Counts the number of rows in the specified column that contain non-blank values. |
The following example returns a whole number after counting the values in the =COUNT(Product[Product ID]) |
|
DISTINCTCOUNT()
|
Used to return the number of distinct values in a given column. |
The following example returns the calculated number of distinct values from the =DISTINCTCOUNT(Category) |
|
AVERAGE()
|
Returns the average (arithmetic mean) of all the numbers in a column. |
The following example returns the average of the =AVERAGE(Sales[Revenue]) |
|
MAXA()
|
Returns the largest value in a column. |
The following example returns the largest value of the =MAXA(Sales[Revenue]) |
|
MINA()
|
Returns the smallest value in a column. |
The following example returns the minimum value of the =MINA(Sales[Revenue]) |
|
Math and trig functions |
DIVIDE()
|
Used to perform divisions and returns an alternate result or BLANK() on division by 0. Refer to: DIVIDE function vs divide operator (/) in DAX - DAX | Microsoft Learn to understand why this function is used instead of ‘/’ to perform divisions. |
The following example returns 5.
=DIVIDE(10,2)
The following example returns a blank.
=DIVIDE(10,0)
The following example returns 1.
=DIVIDE(10,0,1)
|
Statistical functions |
MEDIAN()
|
Returns the median of numbers in a column. |
The following example returns the median value of the =MEDIAN(Sales[Revenue]) |
STDEV.P()
|
Returns the standard deviation of the entire population. |
The following example returns the standard deviation value of the =STDEV.P(Sales[Revenue]) |
|
PERCENTILE.INC()
|
Returns the kth percentile of values in a range, where k is in the range 0-1 inclusive. Refer to PERCENTILE.INC function (DAX) - DAX | Microsoft Learn for more details. |
The following example returns the first percentile value of the =PERCENTILE.INC(Sales[Revenue,0.25]) The following example returns the third percentile value of the =PERCENTILE.INC(Sales[Revenue,0.75]) |
|
NORM.DIST()
|
Returns the normal distribution for the specified mean and standard deviation. Refer to NORM.DIST function (DAX) - DAX | Microsoft Learn for more details. |
Returns the normal distribution, for the X (32) value, having a mean of 30 and standard deviation of 1.5, for a cumulative distribution.
=NORM.DIST(32,30,1.5,TRUE)
|
Additional resources
Refer to the following sources to learn more about DAX syntax and how it can be used.
- DAX Guide
- DAX overview | Microsoft Learn
- Data Analysis Expressions (DAX) Reference - DAX | Microsoft Docs
- DAX Resource Center - TechNet Articles - United States (English) - TechNet Wiki (microsoft.com)
Knowledge check
Complete the following task.
Putting it all together
Your focus at this point should be on learning the syntax and the basic rules for using DAX statements.
Note: You will learn to use Power BI Desktop later in topic 6 of this module.
The following video demonstrates how the statements, functions, values and operators work together in DAX queries and the output it generates in Power BI Desktop.
In the following series of examples and activities, you will see how statements, functions, values and operators all work together in DAX queries.
Practical activity - Writing DAX statements
Refer to the data table and answer the following questions. For each question, attempt to write DAX queries yourself before looking at the solutions and answers provided.
Order ID | Product ID | Product Name | Unit Price | Unit Cost | Quantity | Sales |
---|---|---|---|---|---|---|
001 | 101002 | Colored Pencils | $ 3.50 | $ 1.00 | 5 | $17.50 |
001 | 101004 | Flourescent Highlighters | $ 5.00 | $ 2.50 | 2 | $10.00 |
002 | 101001 | Crayons | $ 7.50 | $ 3.00 | 1 | $7.50 |
002 | 101003 | Dustless Chalk Sticks | $ 2.50 | $ 1.50 | 2 | $5.00 |
002 | 101004 | Flourescent Highlighters | $ 5.00 | $ 2.50 | 1 | $5.00 |
003 | 101003 | Dustless Chalk Sticks | $ 2.50 | $ 1.50 | 10 | $25.00 |
003 | 101004 | Flourescent Highlighters | $ 5.00 | $ 2.50 | 10 | $50.00 |
Question 1
Create a calculated measure called 'Total Sales1'
that adds all the values in the Sales field for all records in the Orders table.
Total Sales1
= SUM( Orders[Sales] )Question 2
Create a calculated measure called 'Total Sales2'
that calculates the total sales value by multiplying the Unit Price and Quantity fields in the Orders table and adding all those values to get the total sales value as the final result.
Total Sales2
= SUMX( Orders, Orders[Unit Price] * Orders[Quantity])Question 3
Create a calculated column called 'Cost'
to calculate the total cost of each product sold in each row of the Orders table.
Cost
= Orders[Cost] * Orders[Quantity]Question 4
Create a calculated measure called 'Gross Profit'
that calculates the difference between the previously created measures Total Sales1
and Total Cost
.
Gross Profit
= Orders[Cost] * Orders[Quantity]Question 5
Create a calculated measure called 'Profit Margin'
that calculates the division of the 'Gross Profit'
value and ensures that a value of '0' is indicated if the returned value equals '0'.
Profit Margin
= DIVIDE ( Orders[Cost] * Orders[Quantity] )Topic summary
Congratulations on completing your learning for this topic Programming protocols and techniques.
In this topic you learnt the following.
- Introduction to Data Analysis Expressions (DAX)
- DAX formula syntax
- DAX functions
- Writing DAX statements
Check your learning
The final activity for this topic is a set of questions that will help you prepare for your formal assessment.
Knowledge check
Complete the following three (3) questions. Click the arrows to navigate between the questions.
Assessments
Now that you have completed the basic knowledge required for this module, you are ready to complete the following assessment event:
- Assessment 2 (Online Quiz)
What’s next?
Next, we will dive into how to identify business requirements relating to using big data for operational decision-making.