Breakout sessions
Friday, June 9th
You can experience the following sessions, delivered by international experts from both Microsoft and the field.
-
Azure DevOps for Data Engineers: the Git way
If you want to set up Azure DevOps in the way that works best with Git, what would that look like?
The usual advice is “one repository per independently deployable product”. But what counts as one product? As Data Platform people, many of us are used to thinking of our (former?) Data Warehouse as a black box, but that’s not what’s meant here!
We’ll take inspiration from our software development peers and look at how to break up our data platform projects into multiple smaller repositories and accompanying pipelines. How does that work with schema changes? How do we orchestrate everything in a neat way? And are there other best practices to keep in mind?
-
Azure Synapse Analytics: Networking for Production
Azure Synapse Analytics is a wonderful collection of capabilities to bring your data to life, but making sense of how to secure the networking for Synapse Analytics can be challenging. We need to understand concepts like managed VNets and private endpoints, as well as Azure DNS, network architecture, and possibly even connectivity methods from on-premises, to make this work in more complex scenarios.
Join this session to understand networking for Azure Synapse Analytics and how you can find the correct level of network security for your scenario. We will start with an overview of Azure Synapse Analytics and the different endpoints that make up its capabilities. Then we will go from the most open networking configuration to locking everything down to traffic from your own networks only. We will finish with a walkthrough of a common scenario: secure access to your Synapse Analytics workspace from an on-premises site.
-
Building a Data Sharing Lakehouse with Unity Catalog
You’ve written some PySpark, loaded data into a lake and built some lovely data models… now what? How do you open that data up to your analytics community? How do you build a secure but easy-to-use platform?
With Unity Catalog, Databricks now gives us a governance platform for securing, documenting & presenting data to many different use cases. In this session we’ll dive into how this changes our Lakehouse patterns, show how to get started with Unity Catalog, and look at some of the new features!
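For a flavour of what that governance layer looks like in practice, here is a minimal sketch of publishing a curated table and opening it up to an analytics group through Unity Catalog’s three-level namespace. It assumes a Databricks cluster attached to a Unity Catalog metastore (where the spark session already exists); the catalog, schema, table, and group names are hypothetical.

# Minimal sketch: publish a curated table and grant read access via Unity Catalog.
# Assumes a Databricks cluster attached to a Unity Catalog metastore; the names
# used here (main, gold, orders, analysts) are hypothetical.
spark.sql("CREATE CATALOG IF NOT EXISTS main")
spark.sql("CREATE SCHEMA IF NOT EXISTS main.gold")

# Register curated data as a managed table in the three-level namespace.
spark.table("hive_metastore.default.orders_curated") \
     .write.saveAsTable("main.gold.orders")

# Document the asset and grant read access to a workspace group.
spark.sql("COMMENT ON TABLE main.gold.orders IS 'Curated orders for the analytics community'")
spark.sql("GRANT USE CATALOG ON CATALOG main TO `analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA main.gold TO `analysts`")
spark.sql("GRANT SELECT ON TABLE main.gold.orders TO `analysts`")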
Some familiarity with lakes & Databricks will help!
-
dbt with Azure Synapse
dbt is the new data transformation tool taking the world by storm. It lowers the barrier of entry into the world of data analytics for everyone who has ever written a line of SQL. Did you know it integrates quite well with Azure Synapse? Join this session to follow in the footsteps of thousands of analytics engineers and fall in love with dbt. Learn how dbt works with Azure SQL and Azure Synapse from one of the maintainers of the official dbt adapter! We’ll use Azure Synapse and VS Code to build our first Hello Azure project.
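As a taste of what that first project can look like, here is a minimal sketch of invoking dbt programmatically from Python. It assumes dbt-core 1.5+ with the dbt-synapse adapter installed and a profiles.yml that already points at your Synapse workspace; the model name hello_azure is hypothetical.

# Minimal sketch: run a dbt project against Azure Synapse from Python.
# Assumes dbt-core >= 1.5 and the dbt-synapse adapter are installed, and that
# profiles.yml already points at your Synapse dedicated SQL pool.
from dbt.cli.main import dbtRunner, dbtRunnerResult

dbt = dbtRunner()

# Equivalent to `dbt run --select hello_azure` on the command line
# ("hello_azure" is a hypothetical model name).
res: dbtRunnerResult = dbt.invoke(["run", "--select", "hello_azure"])

if not res.success:
    raise SystemExit("dbt run failed")

for r in res.result:
    print(f"{r.node.name}: {r.status}")

In day-to-day work you would run the same command from the VS Code terminal; the programmatic runner is handy when dbt is embedded in an orchestration pipeline.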
-
Designing Data Architectures that InfoSec will actually approve
Building your data platform in the cloud is easy, but as soon as that dreaded word “security” becomes involved it suddenly becomes incredibly painful. How do you integrate it with your existing networking? How do you manage user security? What on earth is a private endpoint? Over the past year, a lot of these tools have evolved, and we now have a set of mature patterns we can apply to actually make a modern data platform secure.
In this session I’ll guide you through a secure reference architecture with Data Factory, Databricks, Data Lake, and Azure Synapse working together as a secure, fully productionised platform. Each has its own idiosyncrasies, but this session will teach you the options available and the pitfalls to avoid.
-
Driving alerts and actions on your data
The content for this session is not yet available.
-
Implementing Azure Data Integration Pipelines in Production
Within a typical Azure data platform solution for any enterprise-grade data analytics or data science workload, an umbrella resource is needed to trigger, monitor, and handle the control flow for transforming datasets. Those requirements are met by deploying Azure data integration pipelines, delivered using Synapse Analytics or Data Factory. In this session I’ll show you how to create rich, dynamic data pipelines and apply these orchestration resources in production, using scaled architecture design patterns, best practice, and the latest metadata-driven frameworks. We will take a deeper dive into the service, considering how to build custom activities and dynamic pipelines, and think about hierarchical design patterns for enterprise-grade deployments. Through a series of short stories based on real-world experience, I will take you through how to implement data integration pipelines in production.
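To make the metadata-driven idea concrete, here is a minimal sketch of a control loop that starts one parameterised pipeline run per source dataset using the Azure SDK for Python. The resource names, pipeline name, and metadata entries are hypothetical, and a production framework would read the metadata from a control table rather than hard-coding it.

# Minimal sketch of a metadata-driven trigger: loop over dataset metadata and
# start one parameterised Data Factory pipeline run per entry.
# Subscription, resource group, factory, pipeline, and metadata are hypothetical.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "rg-dataplatform"
FACTORY_NAME = "adf-dataplatform"

# In a real framework this metadata comes from a control table, not a literal.
datasets = [
    {"source_schema": "sales", "source_table": "orders", "watermark_column": "ModifiedDate"},
    {"source_schema": "sales", "source_table": "customers", "watermark_column": "ModifiedDate"},
]

client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

for ds in datasets:
    run = client.pipelines.create_run(
        RESOURCE_GROUP,
        FACTORY_NAME,
        "PL_Ingest_Generic",  # hypothetical parameterised ingestion pipeline
        parameters=ds,
    )
    print(f"Started run {run.run_id} for {ds['source_schema']}.{ds['source_table']}")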
-
Long Live Star Schema! Collaboratively design your dimensional model with business users
Despite claims to the contrary, dimensional modelling and star schemas are alive and well in the modern data world. But whilst developers might have great technical skills and understand how to build a star schema, they may lack the business domain knowledge to ensure that what they deliver is fit for use by analysts and self-service users. On the flip side, these end users often know what they want and need from a data platform, but struggle to explain this in a way that makes it easy for developers to implement. How can we improve the requirements gathering process to make sure we avoid the tensions that can arise from this? Enter “SunBeam”. SunBeam is a technique developed by Advancing Analytics that looks to bridge the gap between business and IT by using an end-to-end process for working with business users to collaboratively design a star schema. This session is an introduction to this technique.
-
Modelling and indexing your data warehouse
At some point, when working with data, the star schema pops up. There is a lot of misconception about the star schema, but once you realise it is designed for our data technologies and our data technologies are optimised for it, it becomes a very powerful pattern. This session is deeply technical, about designing star schemas and indexing them correctly, and how this is rooted in our technologies. The end result is that attendees can build a data warehouse for less money and have a self-service platform like Power BI hold more data than with other patterns.
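As a flavour of what “designed for our technologies” means in practice, here is a minimal sketch of a star-schema fact table on a Synapse dedicated SQL pool, hash-distributed on a join key and stored as a clustered columnstore index; the connection details, table, and columns are hypothetical.

# Minimal sketch: create a fact table with hash distribution and a clustered
# columnstore index on a Synapse dedicated SQL pool.
# Connection string, table, and column names are hypothetical.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace.sql.azuresynapse.net;Database=mydwh;"
    "Authentication=ActiveDirectoryInteractive;",
    autocommit=True,
)

conn.cursor().execute("""
CREATE TABLE dbo.FactSales
(
    DateKey      int            NOT NULL,
    CustomerKey  int            NOT NULL,
    ProductKey   int            NOT NULL,
    Quantity     int            NOT NULL,
    SalesAmount  decimal(18, 2) NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(CustomerKey),  -- co-locate rows that join to DimCustomer
    CLUSTERED COLUMNSTORE INDEX        -- compressed, scan-friendly storage for the fact table
);
""")

The columnstore compression is a large part of the “more data for less money” argument: fact rows compress far better in columnar storage than in a traditional rowstore.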
-
Protect your data from tampering with Ledger in SQL
Establishing trust around the integrity of data stored in database systems has been a longstanding problem for all organizations that manage financial, medical, or other sensitive data. Ledger is a new feature in Azure SQL and SQL Server that incorporates blockchain crypto technologies into the RDBMS to ensure the data stored in a database is tamper-evident. The feature introduces ledger tables that make data tampering easy to detect.
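To make that concrete, here is a minimal sketch of creating an append-only ledger table and generating a database digest from Python over pyodbc; it assumes an Azure SQL Database (or SQL Server 2022) with the ledger feature available, and the connection details and table are hypothetical.

# Minimal sketch: create an append-only ledger table and generate a database
# digest that can later be used to verify that nothing was tampered with.
# Connection details and table name are hypothetical.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myserver.database.windows.net;Database=mydb;"
    "Authentication=ActiveDirectoryInteractive;",
    autocommit=True,
)
cur = conn.cursor()

# Append-only ledger table: inserts are allowed, updates and deletes are blocked,
# and every row is cryptographically chained into the database ledger.
cur.execute("""
CREATE TABLE dbo.KeyCardEvents
(
    EmployeeId int       NOT NULL,
    DoorId     int       NOT NULL,
    EventTime  datetime2 NOT NULL
)
WITH (LEDGER = ON (APPEND_ONLY = ON));
""")

cur.execute("INSERT INTO dbo.KeyCardEvents VALUES (42, 1, SYSUTCDATETIME());")

# Generate the current database digest; store it in trusted storage so that
# sys.sp_verify_database_ledger can later prove the data is unchanged.
digest = cur.execute("EXEC sys.sp_generate_database_ledger_digest;").fetchone()[0]
print(digest)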
In this session, we will cover
• The basic concepts of Ledger and how it works
• Ledger Tables
• Digest management and database verification
After completing this session, you will understand how SQL Ledger works and how it can help to protect sensitive data.
-
Solve your Data Governance challenges with Microsoft Purview
What data do I have? Where did the data come from? Can I trust it? How do I manage access and control?
These are questions that a Chief Data Officer wants answered when analyzing an organization’s data estate.
Data consumers, data producers, and security administrators all have their own challenges. Microsoft Purview is designed to address these challenges.
Microsoft Purview helps you understand assets across your entire data estate and provides easy access to all data, security, and risk solutions.
In this session, we’ll take a closer look at Unified Data Governance, one of Microsoft Purview’s solutions, and see whether we can answer the following questions:
· What challenges do organizations and user groups face with Data Governance?
· How can Microsoft Purview contribute to this?
· How can we easily create a holistic, up-to-date map of our data landscape?
· How can we find valuable and reliable data?
· What are the costs for Microsoft Purview?
· What are the latest/new features available in Microsoft Purview?
So if you’re a CDO, a data consumer, a data producer, or a security administrator, this session is definitely worth following.
-
Spark Execution Plans for Databricks
Databricks is a powerful data analytics tool for data science and data engineering, but understanding how code is executed on the cluster can be daunting.
Using Spark execution plans allows you to understand the execution process and flow, which is great for optimizing queries and identifying bottlenecks.
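For example, here is a minimal PySpark sketch (the DataFrame itself is hypothetical; on Databricks the spark session already exists) of asking Spark for the plans it builds before executing a simple aggregation:

# Minimal sketch: inspect the plans Spark builds for a simple aggregation.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(
    [("NL", 10), ("NL", 20), ("UK", 5)],
    ["country", "amount"],
)
agg = df.groupBy("country").agg(F.sum("amount").alias("total"))

# Parsed, analysed and optimised logical plans, plus the physical plan.
agg.explain(mode="extended")

# Physical plan broken down per operator, with codegen stage ids.
agg.explain(mode="formatted")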
This session will introduce you to Spark execution plans, the execution flows and how to interrogate the different plans.
By the end of this session, you will have everything you need to start optimizing your queries.
-
Synapse Analytics – how it works
Synapse is a powerful cloud-based analytics service offered by Microsoft Azure that can help organizations accelerate their data insights and simplify their data management. This session is designed to provide attendees with a comprehensive guide to Synapse, including its key features and capabilities.
The session will begin with an overview of Synapse and how it works. This will be followed by an in-depth look at its main components, such as Synapse Studio, Synapse Analytics, and Synapse Pipelines. We will explore how to use each of these components to design, develop, and deploy big data solutions that can scale to meet the needs of any organization.
Additionally, we will dive into various Synapse use cases, such as data warehousing, data lakes, and machine learning, and discuss how Synapse can be used to solve real-world business problems. Attendees will also learn best practices for working with Synapse, such as optimizing performance, security, and monitoring.
By the end of this session, attendees will have a solid understanding of Synapse and how it can be used to drive business value. They will leave with practical knowledge on how to implement Synapse in their own organizations, and how to leverage its full potential to achieve their data analytics goals.
-
Synapse Espresso Lungo
The Synapse Espresso team is coming to your favorite conference with a freshly brewed pot of content!
Join us in this hour-long session to learn how to unlock the power of Synapse SQL pools and boost your data warehouse and data lake performance.
We will cover everything you need to know to delight your business users by making serverless SQL pool fly! From basics to squeezing the very last bit of performance out of it.
If you are more into data warehouses, don’t worry, we have you covered as well! We will cover the best strategies for ingestion and consumption of your data so you can start working on your data warehouse quickly and with confidence.
Every tip and trick we share is proven in real world customer scenarios.
And all of this comes in easy-to-consume sips!
-
Synapse for the entire data department
Synapse is not only very user-friendly and easy to set up, it is also a great tool for working across different data roles.
Synapse improves cross-functional collaboration by breaking down barriers and creating understanding between data scientists, data engineers, business analysts, database managers, and the IT department. In part this is because they can see each other’s work while still working in their own preferred language or tool, and when Azure Synapse is set up correctly with Azure DevOps and Git, it also becomes easier to create cross-functional code repositories and good versioning.
1. Data scientists can work with R, Scala, or Python in notebooks and create data pipelines from them; in addition, they can use parts of the machine learning studio directly from Azure Synapse.
2. Data engineers have the best of SQL data warehousing, Data Factory, and data lakes in one place.
3. Business analysts can develop Power BI reports from within Synapse, run SQL queries, and get an overview of the data pipelines, seeing where data comes from and what transformations have been done.
4. Database administrators get a full overview of usage and can easily administer access and security.
5. The IT department can fully monitor spend, usage, and security, and integrate other relevant tools.
-
Synapse Serverless SQL Pools and Power BI. A match made in heaven?!
As a Data Engineer, Data Scientist or Data Visualisation Artist in an organisation, you thrive on creating solutions and architectures that are engineered well and deliver the insights people need at the right time, to the right place, and in the right format. This is our motto, and we strive to live by it.
But have you ever had that occurrence where a report (and its underlying data) had to be delivered fast, and you felt there was no room for those fundamental design decisions? A quick delivery is better than no delivery, right? Let’s assume life is good, and you’re about to settle down with a cup of tea to work on some other backlog items. Then you notice some of those quick-and-dirty decisions coming back to haunt you, taking more time to rectify as there are now more moving parts in the process. Is higher management really going to accept a report that takes minutes to load for their mission-critical decisions, and that may end up costing them even more? Most likely the answer will be no, and we have to do something about it…
This session will focus on the business scenario of a Logical Data Warehouse approach using Azure Synapse Serverless SQL Pools, and Power BI for those lightning-quick insights into our data. Starting with a brief introduction to Azure Synapse Serverless SQL Pools, we will move on to a problematic situation and work our way down a troubleshooting path to gradually improve the performance of every cog in the chain. We begin the scenario with simple CSV files, and continue by leveraging Parquet files, improved usage of data types, partitioning, and data elimination. We then finish off by using techniques on the Power BI side, like hybrid tables and automatic aggregations, to give it that extra bit of juice. With every step taken, we’ll assess the improvements and considerations, and why we think this was a good idea.
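To illustrate the kind of query being tuned, here is a minimal sketch of reading partitioned Parquet files from a serverless SQL pool over pyodbc, using filepath() so that only the requested partition folders are scanned; the workspace, storage account, folder layout, and columns are hypothetical.

# Minimal sketch: query partition-pruned Parquet from a Synapse serverless SQL pool.
# Workspace, storage account, folder layout, and column names are hypothetical.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace-ondemand.sql.azuresynapse.net;Database=master;"
    "Authentication=ActiveDirectoryInteractive;",
    autocommit=True,
)

query = """
SELECT
    rows.filepath(1)  AS sales_year,
    SUM(SalesAmount)  AS total_sales
FROM OPENROWSET(
        BULK 'https://mydatalake.dfs.core.windows.net/curated/sales/year=*/*.parquet',
        FORMAT = 'PARQUET'
     ) AS rows
WHERE rows.filepath(1) = '2023'   -- partition elimination: only the 2023 folder is read
GROUP BY rows.filepath(1);
"""

for sales_year, total_sales in conn.cursor().execute(query).fetchall():
    print(sales_year, total_sales)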
If you are interested in an action-packed and unique collaboration between the Azure FastTrack and Power BI Customer Advisory Teams, you are in for a treat. They will bring their experience from the trenches, working through challenging situations with customers, to make the most out of this common scenario. Above all, you’ll walk away with practical ideas, techniques and insights to try out and improve your own scenarios.
-
Using Synapse to combine data from D365FO with other data for reporting
This session shows how to bring data from Dynamics 365 FO into Synapse Analytics: a showcase, based on real experience, of the Common Data Model (CDM) and how to set up the data flow using the CDMUtil function app and Synapse Analytics (dedicated SQL pool) to automate the extraction of data with Synapse pipelines.
D365 FO is loaded with data, but combining that data with other data, e.g. from fabric sensors, and bringing it into decisions through Power BI has been a struggle. Microsoft has introduced the Export to Data Lake feature in D365, where selected data is copied to the data lake in a CDM format. To automatically interpret the files in the data lake and load the data into a Synapse Analytics SQL pool, we use the CDMUtil function app and a Synapse pipeline, where we can join with other data from the business and utilize the power of the Synapse engine and SQL operations to add value to the final reporting.
At the end of the session the audience will have a good understanding of how to bring data from Dynamics 365 FO into Synapse Analytics for further processing.