
Breakout sessions 2023

Friday, June 9th

You can experience the following sessions, delivered by international experts from both Microsoft and the field.

  • Keynote: Synapse SQL for Microsoft Fabric

    Learn how Microsoft completely rebuilt Synapse SQL, moving from separate Serverless and Dedicated Pools to a single engine.

    Get a behind-the-scenes look at the architecture with showcases on scalability, performance, VertiPaq/V-Order and much more.

    Auditorium A
    Fri 09.15 – 10.00
    • Data analytics
    • Dedicated SQL Pools
    • Fabric
    • Serverless SQL Pool
    • Synapse Analytics
    • Intermediate
    Session presentation
    SquirrelWitchWardrobe.pdf
  • Driving alerts and actions on your data in Microsoft Fabric

    Your data is only valuable if you can act on it. But acting on data often requires manual monitoring of reports and dashboards, which can be time-consuming. That’s why we have created Data Activator. Data Activator is an experience in Microsoft Fabric that lets you create alerts and trigger Power Automate flows from your data, without writing code. In this session I will demo Data Activator so that you can see it in action.

    Auditorium A
    Fri 10.15 – 11.00
    • Data analytics
    • Fabric
    • Introductory and overview
  • Synapse Espresso Lungo

    The Fabric Espresso team is coming to your favorite conference with a freshly brewed pot of content!

    Join us in this hour-long session to learn how to unlock the power of Microsoft Fabric and make your analytics efforts a breeze.

    We will cover all the Fabric components and everything you need to know to delight your business users in no time! You will also learn how to squeeze the best performance out of Fabric!

    Regardless of whether you are into lakehouses or warehouses, we’ve got you covered. Even better, why not both? We will cover the best strategies for ingesting and consuming your data so you can start working on your analytics solution quickly and with confidence.

    Every tip and trick we share is proven in real-world customer scenarios.

    And all of this comes in easy-to-consume sips!

    Room C
    Fri 10.15 – 11.00
    • Data Warehouse
    • Serverless SQL Pool
    • Spark Notebooks
    • Synapse Analytics
    • Intermediate
    Session presentation
    FabricInAnHour.pdf
  • Azure Synapse Analytics: Networking for Production

    Azure Synapse Analytics is a wonderful collection of capabilities to bring your data to life, but making sense of how to secure the networking for Synapse Analytics can be challenging. We need to understand concepts like managed VNets and private endpoints, as well as know about Azure DNS, network architecture, and possibly even connectivity methods from on-premises to make this work in more complex scenarios.

    Join this session to understand networking for Azure Synapse Analytics and how to find the correct level of network security for your scenario. We will start with an overview of Azure Synapse Analytics and the different endpoints that make up its capabilities. Then we will go from the most open networking configuration to locking everything down to traffic from your own networks only. We will finish with a walk-through of a common scenario: secure access to your Synapse Analytics workspace from an on-premises site.

    Room D
    Fri 10.15 – 11.00
    • Data Lakehouse
    • Dedicated SQL Pools
    • Serverless SQL Pool
    • Spark Notebooks
    • Synapse Analytics
    • Intermediate
  • Microsoft Fabric – Introduction & Overview

    On May 23, 2023, Microsoft unveiled its latest breakthrough – Microsoft Fabric, heralding a new era for the data world. But what is Fabric, and why is it critical for those engaged in Microsoft’s data platform? This 45-minute session will delve into the heart of Fabric, elucidating its key components and its transformative potential.

    In this session, we’ll explore the driving forces behind Fabric, its role in the modern data landscape, and why it’s an essential tool for any Microsoft data platform professional. We’ll demystify the intricacies of Fabric, focusing on the aspects most crucial for you to understand, and equip you with the knowledge to harness its power in your work.

    We will cover:
    * An overview of Microsoft Fabric
    * The importance of Fabric in data management: Its unique capabilities and the problems it can solve (OneLake, Copilot…)
    * Getting started with Microsoft Fabric
    * Hands-on demo

    Whether you’re a seasoned data professional or a newcomer, this session will provide a comprehensive introduction to Fabric, positioning you to take full advantage of this revolutionary technology. Join us as we unravel the threads of Microsoft Fabric and step into the future of data management.

    Auditorium A
    Fri 11.15 – 12.00
    • Data analytics
    • Data Integration Pipelines
    • Dedicated SQL Pools
    • Delta Lake
    • Fabric
    • GitHub
    • Serverless SQL Pool
    • Spark Notebooks
    • Synapse Analytics
    • Synapse Link
    • Stream Analytics
    • Introductory and overview
    Session presentation
    Fabric – Data Platform Next Step.pdf
  • Azure DevOps for Data Engineers: the Git way

    If you want to set up Azure DevOps the way that works best with Git, what would it be?

    The usual advice is “one repository per independently deployable product”. But what is one product? As Data Platform people, many of us are used to thinking of our (former?) Data Warehouse as a black box. But that’s not what’s meant here!

    We’ll take inspiration from our software development peers and look at how to break up our data platform projects into multiple smaller repositories and accompanying pipelines. We’ll also take a close look at developments like Data Mesh and the positioning of Fabric to find out how we can support “Data Products” with Azure DevOps.

    We’ll look into the practice of breaking up your data project into multiple small repositories and accompanying pipelines. How does that work with schema changes? And how do we orchestrate everything in a neat way? Are there other best practices to keep in mind?

    Room C
    Fri 11.15 – 12.00
    • DevOps & DataOps
    • Azure Databricks
    • Azure DevOps
    • Azure SQL Database
    • Data Factory
    • Data Integration Pipelines
    • Delta Lake
    • Fabric
    • Synapse Analytics
    • Intermediate
    Session presentation
    Slidedeck workshop.pdf
  • Designing Data Architectures that InfoSec will actually approve

    Building your data platform in the cloud is easy, but as soon as that dreaded word “security” becomes involved it suddenly becomes incredibly painful. How do you go about integrating it with your existing networking, how do you manage user security, what on earth is a private endpoint? Over the past year, a lot of these tools have evolved and we now have a set of mature patterns we can apply to actually make a modern data platform secure.

    In this session I’ll guide you through a secure reference architecture with Data Factory, Databricks, Data Lake, and Azure Synapse working together as a secure, fully productionised platform. Each has its own idiosyncrasies, but this session will teach you the options available and the pitfalls to avoid.

    Room D
    Fri 11.15 – 12.00
    • Data Lakehouse
    • Azure Databricks
    • Data Factory
    • Synapse Analytics
    • Intermediate
    Session presentation
    Designing Data Architectures InfoSec will approve – DPNS.pdf
  • Microsoft Fabric Q&A

    You have seen all the Fabric marketing. Now is your chance to ask all the technical questions and get them answered directly by members of the Microsoft Product Group.

    Auditorium A
    Fri 12.00 – 12.30
    • Data analytics
    • Fabric
    • Intermediate
  • dbt & Fabric: better together

    dbt is the new data transformation tool taking the world by storm. It lowers the barrier to entry into the world of data analytics for everyone who has ever written a line of SQL. Did you know it integrates quite well with all Microsoft SQL products and even with Fabric? Join this session to follow in the footsteps of thousands of analytics engineers and fall in love with dbt. Learn more about how dbt works with Fabric and Azure SQL from the maintainer of the official dbt adapter! We’ll use Fabric and VS Code to build our first Hello Fabric project.
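
    The abstract mentions building a first project with VS Code. As a hedged taste of what driving dbt from code can look like (the model name hello_fabric is hypothetical, and the project is assumed to already be configured with one of the Microsoft SQL adapters), dbt-core 1.5+ exposes a programmatic entry point:

        # Minimal sketch: invoke dbt programmatically (requires dbt-core >= 1.5).
        # Run from a dbt project directory whose profile targets Fabric or
        # Azure SQL; "hello_fabric" is a hypothetical model name.
        from dbt.cli.main import dbtRunner

        runner = dbtRunner()
        result = runner.invoke(["run", "--select", "hello_fabric"])
        print("success:", result.success)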

    Auditorium A
    Fri 13.00 – 13.45
    • Data Warehouse
    • Azure SQL Database
    • dbt Labs
    • Dedicated SQL Pools
    • Fabric
    • SQL Server
    • Synapse Analytics
    • Introductory and overview
    Session presentation
    Data Platform Next Step dbt Synapse Fabric.pdf
  • Spark Execution Plans for Databricks

    Databricks is a powerful data analytics tool for data science and data engineering, but understanding how code is executed on a cluster can be daunting.

    Using Spark execution plans allows you to understand the execution process and flow, which is great for optimizing queries and identifying bottlenecks.

    This session will introduce you to Spark execution plans, the execution flows and how to interrogate the different plans.
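
    As a small, hedged taste of what interrogating a plan looks like (the dataset and names here are illustrative, not from the session), PySpark exposes the plans directly on any DataFrame:

        # Illustrative only: inspect the plan of a simple self-join.
        from pyspark.sql import SparkSession

        spark = SparkSession.builder.appName("plan-demo").getOrCreate()

        orders = spark.range(1_000_000).withColumnRenamed("id", "order_id")
        regions = orders.selectExpr("order_id", "order_id % 10 AS region")
        joined = orders.join(regions, "order_id")

        # mode="formatted" (Spark 3.x) prints a readable physical plan;
        # mode="extended" also shows the parsed, analyzed, and optimized plans.
        joined.explain(mode="formatted")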

    By the end of this session, you will have everything you need to start optimizing your queries.

    Room C
    Fri 13.00 – 13.45
    • Data Lakehouse
    • Azure Databricks
    • Spark Notebooks
    • Synapse Analytics
    • Intermediate
    Session presentation
    202306 – DataPlatformNextSteps – Spark Execution Plans.pdf
  • Protect your data from tampering with Ledger in SQL

    Establishing trust around the integrity of data stored in database systems has been a longstanding problem for all organizations that manage financial, medical, or other sensitive data. Ledger is a new feature in Azure SQL and SQL Server that incorporates blockchain crypto technologies into the RDBMS to ensure the data stored in a database is tamper-evident. The feature introduces ledger tables that make data tampering easy to detect.

    In this session, we will cover:
    • The basic concepts of Ledger and how it works
    • Ledger Tables
    • Digest management and database verification

    After completing this session, you will understand how SQL Ledger works and how it can help to protect sensitive data.
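
    To make the concept concrete, here is a hedged sketch of creating an updatable ledger table from Python (connection details and table names are placeholders); the WITH (SYSTEM_VERSIONING = ON ..., LEDGER = ON) clause is the documented T-SQL syntax:

        import pyodbc

        # Placeholder connection string; substitute your own server and database.
        conn = pyodbc.connect(
            "DRIVER={ODBC Driver 18 for SQL Server};"
            "SERVER=myserver.database.windows.net;DATABASE=mydb;"
            "UID=myuser;PWD=mypassword"
        )
        cur = conn.cursor()

        # An updatable ledger table: SQL maintains the history table and a
        # ledger view, chaining row versions cryptographically so tampering
        # becomes detectable (verified via sys.sp_verify_database_ledger).
        cur.execute("""
        CREATE TABLE dbo.Payments (
            PaymentId INT NOT NULL PRIMARY KEY,
            Amount    DECIMAL(10, 2) NOT NULL
        )
        WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.Payments_History),
              LEDGER = ON);
        """)
        conn.commit()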

    Room D
    Fri 13.00 – 13.45
    • Data governance
    • Azure SQL Database
    • SQL Server
    • Other
    • Intermediate
    Session presentation
    Bringing the Power of Blockchain to SQL with Ledger – Data Platform Next Step – No Video.pptx
  • A Deep Dive in DeltaLake Performance

    If you’re using Spark in a production environment, you’ve likely already discovered Delta, one of the leading next-generation file formats. But is that all there is to it? How do you manage your Delta tables to make them FAST?

    This session tackles the question of performance and scale with Delta. We’ll recap Delta itself before looking at partitioning approaches and file pruning. Then we’ll manipulate statistics and optimize our files to squeeze out even more performance. We’ll use Z-Order and understand the change to our file layout and the resulting performance. We’ll finish by comparing Databricks and Microsoft Fabric, where we can pit Bloom filter indexes against the new V-Order optimisations!
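
    As a hedged illustration of the Z-Order step (the table and column names are hypothetical), on Databricks this is a one-liner over an existing Delta table:

        # Illustrative Databricks notebook cell; `spark` is predefined there
        # and a Delta table named `events` is assumed to exist.
        # Z-Ordering co-locates related values in the same files so data
        # skipping can prune files on the chosen column at query time.
        spark.sql("OPTIMIZE events ZORDER BY (user_id)")

        # The table history shows the rewrite (numFilesAdded / numFilesRemoved).
        spark.sql("DESCRIBE HISTORY events").show(truncate=False)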

    Auditorium A
    Fri 14.00 – 14.45
    • Data Lakehouse
    • Azure Databricks
    • Delta Lake
    • Fabric
    • Advanced
    Session presentation
    Delta Performance DeepDive.pdf
  • Azure Data Factory – Deployment challenges

    ADF is an important building block in the architecture of any modern data warehousing solution and many other scenarios.
    Although the service has existed for some time now and we know its capabilities pretty well, its deployment still leaves much to be desired, especially in more complex instances.
    In this session, I will show a few challenges in publishing ADF and solutions for them.

    Room C
    Fri 14.00 – 14.45
    • DevOps & DataOps
    • Azure DevOps
    • Data Factory
    • Intermediate
    Session presentation
    DPNS_NowinskiK_ADF_DevOps.pdf
  • Solve your Data Governance challenges with Microsoft Purview

    What data do I have? Where did the data come from? Can I trust it? How do I manage access and control?
    These are questions that a Chief Data Officer wants answered when analyzing an organization’s data estate.

    Data consumers, data producers, and security administrators all have their own challenges. Microsoft Purview is designed to address these challenges.

    Microsoft Purview helps you understand assets across your entire data estate and provides easy access to all data, security, and risk solutions.

    In this session, we’ll take a closer look at Unified Data Governance, one of Microsoft Purview’s solutions, and see whether we can answer the following questions:

    · What challenges do organizations and user groups face with Data Governance?
    · How can Microsoft Purview contribute to this?
    · How can we easily create a holistic, up-to-date map of our data landscape?
    · How can we find valuable and reliable data?
    · What are the costs for Microsoft Purview?
    · What are the latest/new features available in Microsoft Purview?

    So whether you’re a CDO, a data consumer, a data producer, or a security administrator, this session is definitely worth following.

    Room D
    Fri 14.00 – 14.45
    • Data governance
    • Purview
    • Intermediate
    Session presentation
    2023-06-09DP Next Step SQL Days Data Governance with Microsoft Purview Erwin De Kreuk.pdf
  • Implementing Azure Data Integration Pipelines in Production

    Within a typical Azure data platform solution, any enterprise-grade data analytics or data science workload needs an umbrella resource to trigger, monitor, and handle the control flow for transforming datasets. Those requirements are met by deploying Azure Data Integration pipelines, delivered using Synapse Analytics or Data Factory. In this session I’ll show you how to create rich, dynamic data pipelines and apply these orchestration resources in production, using scaled architecture design patterns, best practices, and the latest metadata-driven frameworks. We will take a deeper dive into the service, considering how to build custom activities and dynamic pipelines, and think about hierarchical design patterns for enterprise-grade deployments. In a series of short stories based on real-world experience, I will take you through how to implement data integration pipelines in production.
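
    To illustrate the metadata-driven idea in the abstract (this is a hypothetical sketch, not the framework from the session), a control list describes each dataset and one generic loop drives the copies; in Data Factory or Synapse this would typically be a Lookup activity feeding a ForEach:

        # Hypothetical sketch of a metadata-driven control flow.
        COPY_METADATA = [
            {"source": "erp.customers", "target": "raw/customers", "watermark": "ModifiedDate"},
            {"source": "erp.orders",    "target": "raw/orders",    "watermark": "OrderDate"},
        ]

        def copy_dataset(entry: dict) -> None:
            # Placeholder for the real copy step (e.g. a parameterized
            # Copy activity receiving source, target, and watermark).
            print(f"Copying {entry['source']} -> {entry['target']} "
                  f"(incremental on {entry['watermark']})")

        for entry in COPY_METADATA:
            copy_dataset(entry)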

    Auditorium A
    Fri 15.00 – 15.45
    • Data integration
    • Data Factory
    • Data Integration Pipelines
    • Synapse Analytics
    • Advanced
    Session presentation
    Implementing Azure Data Integration Pipelines in Production.pdf
  • Kennie, Log Analytics, and Facts – What time is love? (live from Billund)

    For the last four years, I have worked extensively with telemetry on the Dynamics 365 Business Central product. All of this telemetry is stored in Azure Log Analytics, a big data store that can handle A LOT of data. In this session I will share my love for this technology, giving you an overview of Log Analytics and KQL, including how KQL compares to SQL. I will show use cases from Azure Monitor and Azure Application Insights as examples of Log Analytics in action. I will discuss the cost of Log Analytics (and cost-control strategies). And finally, I will talk a little bit about what this means for Real-Time Analytics in Microsoft Fabric.
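
    As a hedged example of querying Log Analytics from code (the workspace ID is a placeholder), the azure-monitor-query package runs KQL against a workspace; note how a KQL pipeline reads top to bottom, unlike SQL’s clause ordering:

        from datetime import timedelta

        from azure.identity import DefaultAzureCredential
        from azure.monitor.query import LogsQueryClient

        client = LogsQueryClient(DefaultAzureCredential())

        # Hourly request counts for the last day; roughly the SQL
        # SELECT COUNT(*) ... GROUP BY hour, written as a KQL pipeline.
        kql = """
        AppRequests
        | where TimeGenerated > ago(1d)
        | summarize count() by bin(TimeGenerated, 1h)
        """

        response = client.query_workspace(
            "<workspace-id>", kql, timespan=timedelta(days=1)
        )
        for table in response.tables:
            for row in table.rows:
                print(row)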

    Room C
    Fri 15.00 – 15.45
    • Streaming data
    • Azure Data Explorer
    • Advanced
  • Using Synapse to combine data from D365FO with other data for reporting

    This session shows how to bring data from Dynamics 365 FO into Synapse Analytics: a showcase based on real experience with the Common Data Model (CDM), using the CDMUtil function app and Synapse Analytics (dedicated SQL pool) to automate the extraction of data with Synapse pipelines.

    D365 FO is loaded with data, but combining it with other data, e.g. from fabric sensors, and bringing it into decisions through Power BI has been a struggle. Microsoft has introduced the Export to Data Lake feature in D365, where selected data is copied to the data lake in CDM format. To automatically interpret the files in the data lake and load the data into a Synapse Analytics SQL pool, we use the CDMUtil function app and a Synapse pipeline, where we can join with other business data and utilize the power of the Synapse engine and SQL operations to add value to the final reporting.

    At the end of the session, the audience will have a good understanding of how to bring data from Dynamics 365 FO into Synapse Analytics for further processing.

    Room D
    Fri 15.00 – 15.45
    • Data Lakehouse
    • Serverless SQL Pool
    • Synapse Analytics
    • Intermediate
    Session presentation
    Dataplatform Next.pdf
  • Using Lakehouse Data at scale with Power BI, featuring Direct Lake mode

    Many companies have invested heavily in building data lakes to store large volumes of structured and unstructured data from various sources as Delta Parquet files. These files can be used for a wide range of analytics and business intelligence applications. Yet most of these organizations struggle to derive insights from their investments, due to the complexity of accessing and querying the data and the difficulty of letting self-service users connect to this data in the lake using Power BI.

    With the introduction of Microsoft Fabric, an all-in-one analytics solution for enterprises, we now have a better approach for this. In this session, we will explore how to use Lakehouse data at scale with Power BI, using the new Direct Lake connectivity mode. Power BI Direct Lake combines the best of both worlds from Import and DirectQuery mode, and gives us the option for great performance over data in the lake, without introducing additional latency for dataset refreshes.

    We will start by discussing the benefits of the Lakehouse architecture and how it can improve data management and analytics. We will then move on to explore how to connect to Lakehouse data using Power BI by combining both of these architecture components and playing each to its strengths.

    We will also cover best practices for optimizing performance when working with large volumes of data, including data partitioning and query optimization techniques. We will demonstrate how to use Power BI to analyze Lakehouse data in real time and how to build reports that provide actionable insights for decision-making.
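
    As a hedged sketch of the partitioning tip (all names here are hypothetical), writing the lakehouse table as partitioned Delta keeps scans selective for downstream Direct Lake datasets:

        # Illustrative Fabric/Spark notebook cell (`spark` is predefined there).
        from pyspark.sql import functions as F

        # Stand-in fact data; in practice this comes from your ingestion step.
        df = spark.range(1000).withColumn(
            "OrderMonth", (F.col("id") % 12 + 1).cast("int")
        )

        # Partitioning by a low-cardinality column lets queries prune
        # whole folders instead of scanning every file.
        (df.write
           .format("delta")
           .partitionBy("OrderMonth")
           .mode("overwrite")
           .saveAsTable("FactSales"))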

    By the end of the session, attendees will have a solid understanding of how to leverage Lakehouse data at scale with Power BI and how to build powerful analytics solutions that can handle massive amounts of data. Whether you are a data analyst, data scientist, or BI professional, this session will provide you with valuable insights into the world of Lakehouse data and Power BI, featuring the new Direct Lake connectivity mode.

    Auditorium A
    Fri 16.00 – 16.45
    • Data analytics
    • Fabric
    • Serverless SQL Pool
    • Synapse Link
    • Intermediate
    Session presentation
    LakehouseDataAtScalewithPowerBI.pdf
  • Building a Data Sharing Lakehouse with Unity Catalog

    You’ve written some PySpark, loaded data into a lake and built some lovely data models… now what? How do you open that data up to your analytics community? How do you build a secure but easy-to-use platform?

    With Unity Catalog, Databricks now has a governance platform for securing, documenting & presenting data to many different use cases. In this session we’ll dive into how this changes our Lakehouse patterns, how to get started with Unity Catalog, and some of the new features!

    Some familiarity with lakes & Databricks will help!
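
    As a hedged taste (the catalog, schema, table, and group names are hypothetical), Unity Catalog’s three-level namespace and SQL grants look like this from a Databricks notebook:

        # Illustrative only; `spark` is predefined in Databricks notebooks.
        spark.sql("CREATE CATALOG IF NOT EXISTS analytics")
        spark.sql("CREATE SCHEMA IF NOT EXISTS analytics.sales")
        spark.sql(
            "CREATE TABLE IF NOT EXISTS analytics.sales.orders "
            "(order_id BIGINT, amount DECIMAL(10,2))"
        )

        # Governance is plain SQL: give the `analysts` group read access.
        spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `analysts`")
        spark.sql("GRANT USE SCHEMA ON SCHEMA analytics.sales TO `analysts`")
        spark.sql("GRANT SELECT ON TABLE analytics.sales.orders TO `analysts`")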

    Room C
    Fri 16.00 – 16.45
    • Data Lakehouse
    • Azure Databricks
    • Other
    • Intermediate
  • Modelling and indexing your data warehouse

    At some point when working with data, the star schema pops up. There are a lot of misconceptions about the star schema, but once you realise it is designed for our data technologies, and our data technologies are optimised for it, it becomes a very powerful pattern. This deeply technical session is about designing star schemas, indexing them correctly, and how the pattern is rooted in our technologies. The end result is that attendees can build a data warehouse for less money, and have a self-service platform like Power BI hold more data than with other patterns.
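
    To ground the indexing point, here is a hedged, minimal sketch (the connection string and names are placeholders): a dimension keyed with a primary key and a fact table compressed with a clustered columnstore index, the usual pairing for star schemas on SQL engines:

        import pyodbc

        # Placeholder connection string; adjust for your environment.
        conn = pyodbc.connect(
            "DRIVER={ODBC Driver 18 for SQL Server};"
            "SERVER=myserver;DATABASE=mydw;Trusted_Connection=yes"
        )
        cur = conn.cursor()

        # Small dimension: seekable via its primary key (a B-tree index).
        cur.execute("""
        CREATE TABLE dbo.DimCustomer (
            CustomerKey  INT NOT NULL PRIMARY KEY,
            CustomerName NVARCHAR(100) NOT NULL
        );
        """)

        # Large fact: narrow keys plus measures, scanned in bulk.
        cur.execute("""
        CREATE TABLE dbo.FactSales (
            CustomerKey  INT NOT NULL,
            OrderDateKey INT NOT NULL,
            SalesAmount  DECIMAL(12, 2) NOT NULL
        );
        """)

        # Columnstore gives high compression and fast scans over the fact table.
        cur.execute(
            "CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales ON dbo.FactSales;"
        )
        conn.commit()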

    Room D
    Fri 16.00 – 16.45
    • Data Warehouse
    • Azure SQL Database
    • Dedicated SQL Pools
    • SQL Server
    • Other
    • Advanced
    Session presentation
    Modelling and optimize your data warehouse(DPNS).pdf