The new release of SQL Server, SQL Server 2019, includes Apache Spark and the Hadoop Distributed File System (HDFS) for scalable compute and storage. This new architecture, which combines the SQL Server database engine, Spark, and HDFS into a unified data platform, is called a “Big Data Cluster.”
In this one-day workshop you will learn the architecture of the Big Data Cluster (BDC) and its various components. You will see how to create external tables against data sources other than SQL Server, and how to use Spark to run large queries over your data in HDFS or to prepare data. We will also look at how to use notebooks to run both Spark and SQL code.