Monday, June 22, 2015

PolyBase in SQL Server 2016

PolyBase lets you use T-SQL to access data stored in Hadoop.
Sqoop Limitations
PolyBase integrates a Hadoop cluster with SQL Server, so you can query the data in the cluster directly from SQL Server.
 
While the Apache ecosystem provides Sqoop to integrate Hadoop with relational databases, it isn't really enough. With Sqoop, the data is physically moved from the Hadoop cluster into SQL Server, or the relational database of your choice. This is problematic because you need to know, before you run Sqoop, that your database has enough room to hold all the data.
PolyBase – Hadoop Integration with SQL Server
Unlike Sqoop, PolyBase does not load data into SQL Server. Instead, it gives SQL Server the ability to query Hadoop while leaving the data in HDFS on the Hadoop cluster.
Because Hadoop is schema-on-read, you define in SQL Server the schema to apply to your data stored in Hadoop. Once the table schema is defined, PolyBase lets you query that data, which lives outside SQL Server, from within SQL Server.
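
The schema-on-read step described above can be sketched in T-SQL roughly as follows. This is a minimal illustration, not a tested setup: the data source name, name node address and port, HDFS path, file format, and column list are all hypothetical, and it assumes PolyBase is installed and Hadoop connectivity is already configured for your distribution.

```sql
-- Hypothetical names throughout; adjust to your own cluster and data.

-- 1. Point SQL Server at the Hadoop cluster (assumed name node address).
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://namenode.example.com:8020'
);

-- 2. Describe how the files in HDFS are laid out (assumed: comma-delimited text).
CREATE EXTERNAL FILE FORMAT CommaDelimitedText
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

-- 3. Define the schema to apply on read. No data is copied into SQL Server.
CREATE EXTERNAL TABLE dbo.WebLogs_External (
    LogDate    DATE,
    CustomerId INT,
    Url        NVARCHAR(400),
    Hits       INT
)
WITH (
    LOCATION = '/data/weblogs/',
    DATA_SOURCE = MyHadoopCluster,
    FILE_FORMAT = CommaDelimitedText
);

-- 4. Query the Hadoop data with ordinary T-SQL.
SELECT TOP (10) Url, SUM(Hits) AS TotalHits
FROM dbo.WebLogs_External
GROUP BY Url
ORDER BY TotalHits DESC;
```

The external table holds only metadata; the rows stay in HDFS and are read when the query runs.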

Using PolyBase, it is possible to combine data from two completely different storage systems, HDFS and SQL Server, which gives you the freedom to store the data in either place.
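
As a rough sketch of what combining the two stores can look like, the query below joins an ordinary SQL Server table with a PolyBase external table. Both table names and all column names are assumptions made for illustration: dbo.Customers is taken to be a regular table stored in SQL Server, and dbo.WebLogs_External an external table already defined over files in HDFS.

```sql
-- Hypothetical tables:
--   dbo.Customers        : regular table stored in SQL Server
--   dbo.WebLogs_External : PolyBase external table over files in HDFS
SELECT c.CustomerName, SUM(w.Hits) AS TotalHits
FROM dbo.Customers AS c
JOIN dbo.WebLogs_External AS w
    ON w.CustomerId = c.CustomerId
GROUP BY c.CustomerName
ORDER BY TotalHits DESC;
```

From the query's point of view, both tables look the same; only their definitions say where the data physically lives.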

For more details on how to pull data from Hadoop into SQL Server, refer to Polybase in SQL 2016.
