Querying remote data sources in SQL Server

A common activity when writing T-SQL queries is connecting to local databases and processing data directly. But there will be situations in which you need to connect to a remote database that is located in a different instance in the same server or in a different physical server, and process its data in parallel with the local data processing.

SQL Server provides us with four useful methods to connect to the remote database servers, even other database server types, and query its data within your T-SQL statement. In this article, we will discuss these four methods and how to use it to query remote SQL Server databases.

OPENDATASOURCE

The first method to query a remote SQL Server database is the OPENDATASOURCE T-SQL function below:

OPENDATASOURCE ( provider_name as char, init_string )

Where the provider_name is the OLE DB provider used to access the data source. And the init_string is the connection string of the remote server.

To be able to use the OPENDATASOURCE statement, you need to make sure that the DisallowAdhocAccess registry key is set to 0 for the provider you want to connect to other than the SQL Server, which can be found in the below path of the Registry Keys :

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSQLServer\Providers\<ProviderName>

Also you need to enable the Ad Hoc Distributed Queries advanced configuration option which is disabled by default in SQL Server. If you try to run the below simple query that is using the OPENDATASOURCE T-SQL statement you will get the error:

SELECT *

FROM OPENDATASOURCE('SQLNCLI',

'Data Source=DEV_SQL;Integrated Security=SSPI')

.testdb.dbo.AddressBook

Msg 15281, Level 16, State 1, Line 1
SQL Server blocked access to STATEMENT ‘OpenRowset/OpenDatasource’ of component ‘Ad Hoc Distributed Queries’ because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of ‘Ad Hoc Distributed Queries’ by using sp_configure. For more information about enabling ‘Ad Hoc Distributed Queries’, search for ‘Ad Hoc Distributed Queries’ in SQL Server Books Online.

As you can see from the error message, the Ad Hoc Distributed Queries advanced configuration option should be enabled in order to open connection to a remote server using the OPENDATASOURCE. This can be achieved by using the sp_configure query below:

EXEC sp_configure 'show advanced options', 1

RECONFIGURE

EXEC sp_configure 'ad hoc distributed queries', 1

RECONFIGURE

Once the Ad Hoc Distributed Queries advanced configuration option is enabled, the previous query will run successfully. OPENDATASOURCE can replace the server name in the four-part name of the table or view in a SELECT, INSERT, UPDATE, or DELETE statement. It can be also used in the EXECUTE statement to run a remote stored procedure.

OPENROWSET

The second way to query a database hosted in a remote SQL Server is the OPENROWSET T-SQL function. In order to use the OPENROWSET ad hoc method, you need to provide all connection information that is required to connect to the remote SQL server and many other resources. It also can be used for a bulk operation through the built-in BULK provider to read data from files. OPENROWSET is used in the FROM clause as a table name in the SELECT, INSERT, UPDATE, or DELETE statements. Although the query might return multiple result sets, OPENROWSET returns only the first one.

Using OPENROWSET requires enabling the Ad Hoc Distributed Queries advanced configuration option same as the OPENDATASOURCE function.

You need to provide the provider name, the connection string and the query as follows:

1	OPENROWSET('providername', 'datasource','query’)

You can write the previous OPENDATASOURCE query using the OPENROWSET function as follows:

SELECT a.*

FROM OPENROWSET('SQLNCLI', 'Server=DEV_SQL;Trusted_Connection=yes;',

'SELECT * FROM testdb.dbo.Profile') AS a;

Linked Server

A SQL Server Linked Server is used to access remote OLE DB data sources such as a SQL Server instance located outside the SQL Server or other systems such as Oracle, Microsoft Access and Excel, and execute the distributed T-SQL queries against them.

A SQL Server Linked Server is different from ad hoc queries in that ad hoc queries open a temporary connection with the remote server and close it, where the permanent linked server is always available for use. When the user executes a distributed query against a remote data source using a linked server, the SQL Server Engine parses that command and sends the requests to OLE DB. This request can be a query to execute or a table to be opened on that remote server.

A Linked Server can be configured by using SQL Server Management Studio or using the sp_addlinkedserver T-SQL statement.

In order to configure a linked server using SQL Server Management Studio, expand the Server Objects node from the Object Explorer window. Right-click on the Linked Server node and choose New Linked Server.

In the General tab of the New Linked Server window, choose a name for your linked server, then choose the type of the server you need to connect to using that linked server. Select SQL Server if you manage to connect to a remote SQL Server instance, or choose Other Data Source to select from the available OLE DB server types from the Provider drop down list other than SQL Server. If you choose SQL Server as the Server Type, the Linked Server name should be the network name of the remote SQL Server.

Fill the Product Name field with the product name of the selected OLE DB data source, such as SQL Server if you are connecting to a remote SQL Server. Type the name of the selected data source in the Data Source field, such as the SQL Server Instance name if you are connecting to a remote SQL Server instance. Fill the Provider String field with the OLE DB provider-specific connection string that identifies a unique data source. The Location field can be filled with the remote database location as interpreted by the OLE DB provider. Type the name of the catalog to be used by the OLE DB provider in the Catalog field.

You are not required to use all the described arguments together, as the necessary arguments depend on the selected provider. For example, using SQL Server provider, you will use only two arguments.

On the Security page of the New Linked Server window, specify the security context that the linked server will use to connect the original SQL Server to the remote data source. As the name explains, in the Local server logins to remote server logins mappings part of the Security window, you can specify a list of users that can use the linked server to connect to the remote server by mapping these local users with remote server logins. These users can be via SQL Server Authentication or a Windows Authentication login.

For the logins that haven’t been defined in the mapping list, you have to choose the security context for their connections to the remote server using that linked server. Choosing Not be made will prevent any user not included in the previous list from using that linked server. Selecting Be made without using a security context, the users not included in the previous mapping list will connect to the remote server using that linked server without specifying a security context for them. If you choose Be made using the login’s current security context, then the connection to the remote server will be established using the connecting user, which is the best choice. In the Be made using this security context option, specify the SQL Server Authentication credentials that will be used to establish connection to the remote server for the users not defined in the mapping list.

There are many options that you can tune depending on your requirements from the Server Options page of the New Linked Server window. For example, set the Collation Compatible option to true if you are sure that the remote data source has the same character set and sort order as the local server. By default, SQL Server evaluates comparisons on character columns locally if you don’t use this option. The Data Access option is used to enable and disable the distributed query access for the linked server. To enable the remote procedure call from the specified server set the RPC to true, and to enable the remote procedure call to the specified server set the RPC Out to true.

Deciding if the local or remote server’s collation will be used in the query is determined by the Use Remote Collation option. If this option’s value is true, you can specify the collation name that will be used by the remote server in the Collation Name option. This is applicable if the remote data source is not SQL Server, where you can specify any collation name supported by SQL Server in that field.

You can override the default server’s remote login timeout for that linked server by changing the Connection Timeout option to any value larger than 0.This value specifies the time-out in seconds for connecting to the linked server. Also you can specify the time-out in seconds for the queries connecting to the linked server by changing the Query Timeout option to any value larger than 0. This will override the server’s remote query timeout for the linked server.

Another useful option that is used to start a distributed transaction when calling a remote stored procedure, in which this transaction will be manage and protect by the MS DTC. This option is called Enable Promotion of Distributed Transactions.

You can make sure that the linked server is working fine by right-clicking on that linked server and choose Test Connection as follows:

If the connection to the remote server is opened successfully, you will receive the below message, otherwise an error message will be displayed showing that there is something preventing the connection from being opened:

You can easily create the previous linked server using the sp_addlinkedserver T-SQL statement passing the required arguments as follows:

USE [master]

EXEC master.dbo.sp_addlinkedserver @server = N'DEV_SQL', @srvproduct=N'SQL Server'

Once the linked server is created successfully, you will be able to use it by specifying the four-part name that includes: Linked_Server_Name.Remote_Database_Name.Schema_Name.Table_Name as in the below example:

1 2	SELECT * FROM DEV_SQL.testdb.dbo.Profile GO

OPENQUERY

The last SQL Server method that is used to connect to a remote data source is the OPENQUERY function. It is an alternative one-time ad hoc method to connect to a remote server using the linked server. For more frequent connections to the remote server, it is better to use the linked server instead of the OPENQUERY function.

The OPENQUERY function can be used in the FROM clause of the SELECT, INSERT, UPDATE, or DELETE statement replacing the table name. It takes two arguments; the linked server name and the query. These parameters can’t be variable, as follows:

OPENQUERY ( linked_server ,’query’ )

Below is a simple example of the OPENQUERY usage:

1	SELECT * FROM OPENQUERY(DEV_SQL,'SELECT * FROM testdb.dbo.Profile')

Comparison

When using the Linked Server to query a remote server, the query optimizer will create the execution plan after classifying and dividing the query into local and remote queries, where the local queries will be executed locally and the remote queries will be sent to the remote server, then combined together to display the final result to the user as single result set. Another disadvantage of the Linked Server is that, no filtering will be applied on the remote server if the query has WHERE clause, where it will retrieve all the records from the remote table and do the filtering and the joining locally.

In the OpenQuery function case, the SQL Engine will not try to classify the query or check what it will do, simply it will send the query as is to the remote server. Parsing the SQL query, generating the execution plan and all filtering will be performed on the remote server.

In general, OpenQuery is faster than the linked server as the SQL Engine will not break the query before sending it to the remote server, but useful only for one-time less frequent remote connections.

Using the OPENROWSET and OPENDATASOURCE functions, you have to specify all the connection details including the username and password each time you use it. Although these functions don’t provide all linked server functionality such as the security management, it consumes less resources from your server. As these functions open a one-time connection to the remote server, it is better to use the linked server for frequent remote server access.

See more

To boost SQL coding productivity, check out these SQL tools for SSMS and Visual Studio including T-SQL formatting, refactoring, auto-complete, text and data search, snippets and auto-replacements, SQL code and object comparison, multi-db script comparison, object decryption and more

Author
Recent Posts

Ahmad Yaseen

Ahmad Yaseen is a Microsoft Big Data engineer with deep knowledge and experience in SQL BI, SQL Server Database Administration and Development fields.

He is a Microsoft Certified Solution Expert in Data Management and Analytics, Microsoft Certified Solution Associate in SQL Database Administration and Development, Azure Developer Associate and Microsoft Certified Trainer.

Also, he is contributing with his SQL tips in many blogs.

View all posts by Ahmad Yaseen