PolyBase in SQL Server 2016
PolyBase allows you to access data stored in Hadoop by using T-SQL.
Sqoop Limitations
PolyBase integrates a Hadoop cluster with SQL Server, which allows you
to query the data in the Hadoop cluster from SQL Server.
While the Apache ecosystem provides the Sqoop application to integrate
Hadoop with relational databases, it isn't really enough. With Sqoop,
the data is actually moved from the Hadoop cluster into SQL Server, or
the relational database of your choice. This is problematic because you
need to know, before you run Sqoop, that your database has enough room
to hold all the data.
PolyBase – Hadoop Integration with SQL Server
Unlike Sqoop, PolyBase does not load data into SQL Server. Instead, it
gives SQL Server the ability to query Hadoop while leaving the data in
HDFS on the Hadoop cluster.
Because Hadoop is schema-on-read, you define within SQL Server the
schema to apply to your data stored in Hadoop. Once the table schema is
defined, PolyBase lets you query data that lives outside SQL Server
from within SQL Server.
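As a rough illustration of defining a schema over Hadoop data, the sketch below uses the SQL Server 2016 external-object statements (CREATE EXTERNAL DATA SOURCE, CREATE EXTERNAL FILE FORMAT, CREATE EXTERNAL TABLE). It assumes PolyBase is already installed and Hadoop connectivity is configured. The object names, the NameNode address, the HDFS path, and the column list are all hypothetical placeholders, not values from this post.

```sql
-- Hypothetical example: every name, address, and path below is a placeholder.

-- 1. Point SQL Server at the Hadoop cluster (NameNode address is assumed).
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://10.10.10.10:8020'
);

-- 2. Describe how the files in HDFS are laid out (comma-delimited text here).
CREATE EXTERNAL FILE FORMAT CsvFileFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

-- 3. Apply a schema to the files; the data itself stays in HDFS.
CREATE EXTERNAL TABLE dbo.SensorReadings (
    SensorId    INT       NOT NULL,
    ReadingTime DATETIME2 NOT NULL,
    Temperature FLOAT     NULL
)
WITH (
    LOCATION = '/data/sensors/',      -- assumed HDFS directory
    DATA_SOURCE = MyHadoopCluster,
    FILE_FORMAT = CsvFileFormat
);

-- 4. Query the Hadoop data with ordinary T-SQL.
SELECT TOP (10) SensorId, ReadingTime, Temperature
FROM dbo.SensorReadings
WHERE Temperature > 30.0;
```

The external table stores only metadata in SQL Server, so creating it does not require room in the database for the underlying data.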
Using PolyBase, it is possible to integrate data from two completely
different storage systems, which gives you the freedom to store the
data in either place.
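To make the idea of combining the two systems concrete, here is a minimal sketch of a single query that joins a regular SQL Server table with a PolyBase external table. Both table names and all column names are hypothetical: dbo.Sensors is assumed to be an ordinary table stored in SQL Server, and dbo.SensorReadings is assumed to be an external table already defined over files in HDFS.

```sql
-- Hypothetical names: dbo.Sensors lives in SQL Server,
-- dbo.SensorReadings is an external table over files in HDFS.
SELECT
    s.SensorName,
    AVG(r.Temperature) AS AvgTemperature
FROM dbo.Sensors AS s              -- relational data in SQL Server
JOIN dbo.SensorReadings AS r       -- data left in Hadoop
    ON r.SensorId = s.SensorId
GROUP BY s.SensorName;
```

From the query's point of view, the external table behaves like any other table, so where each data set is stored becomes a storage decision rather than a query-design constraint.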
For more details on how to pull data from Hadoop into SQL Server, refer
to Polybase in SQL 2016.