Connect to SQL Server From R

Updated on January 26, 2018
Harsh Diwan profile image

Harsh is a SQL DBA having experience of over 13 years. He has worked on most technologies related to SQL Server. He loves politics , reading

Connect to SQL Server from R

Connect from R to SQL Server
Connect from R to SQL Server

What this article is about

Most of today’s data is stored in relational databases and R needs a way to access that data.

While you are like to get lot of data from data files like comma separated or tab separated files, many times your business data might be stored in a relational database like SQL Server or Oracle or MySQL. At times, your company may copy this data in your desired format like comma separated files and give it to you.

But many times, they can simply give you read access to the database and ask you to get that data from the database directly. In such a scenario, you need to be able to connect to the relational database storing the data. You also need to have at least some basic knowledge about Relational databases and T-SQL like Select , joins , where clause etc. Basically you need to be able to write SQL queries.

This article will just demonstrate how to connect to SQL Server from R and execute a simple query.

This is not an article about writing SQL queries.

What you need

To Connect to SQL Server, you need to know 3 bits of information even before you start.

  • IP address of SQL Server:
  • User name to connect to SQL Server
  • Password for the user to connect

You can ask this information from the team of DBAs that manage the servers on which the data is residing.

Once you have this information ready, now we can start setting up our connection to SQL Server from R.

RODBC Package

There are many packages that help you connect to relational databases in R. For this demonstration, I am using a package named RODBC. First you need to install this package.

install.packages("RODBC")

Make sure that this installation is completed without any error. You can also confirm if that package has installed correctly or not by running the following command.

library(RODBC)
Load RODBC Library in R
Load RODBC Library in R

You should see the command prompt without any error. If this happens it means the library is loaded in R. As you can see that it is an ODBC driver for connecting to relational database. The same library can be used to connect to basically any data source that supports ODBC connections.

Create a data source in Windows

So to connect to SQL Server, first we will create a data source (DSN) in windows.

(I do understand that many people might be using R in Linux. I am afraid since I don’t have Linux machine myself, I can’t do this for Linux users.)

To create a data source, click on Start à Administrative Tools à Data Sources (ODBC)

Search for DSN on windows
Search for DSN on windows

This will open the following window where you will be able to create the ODBC data source that points to the SQL Server hosting the data.

As you can see, I have highlighted in yellow a data source that I have already created. I have named it as UAT.

Existing DSN
Existing DSN

Click on Add to create a new data source and select the option shown in the screen shot below and click on finish.

Select Drivers for data source
Select Drivers for data source

In the next window, you need to fill two fields. One is the name for this data source. This will be used in R. In my case, I have this data source name of UAT.

Description can be anything.

Server is the most important thing. This is where you enter the IP address of the SQL Server where the data is hosted. Please note that for confidentiality purpose, I have erased the IP address that I entered when I create this data source.

Once you have entered this information, click on Next.

Provide SQL Server information
Provide SQL Server information

On this screen, you need to tell the data source how to connect to SQL Server. This is where you will need to specify the user name and the password for the SQL Server that you may have got from the DBA.

Once again, I have scrubbed out my user ID from the next screen shot. Click on next after entering the user name and the password.

Provide SQL Server Authentication information
Provide SQL Server Authentication information

On this screen you don’t need to change anything and just click on next.

No need to take any action here. Just click next
No need to take any action here. Just click next

On the next screen too, don’t make any change and just click on next. Now you will come to the final screen.

Test Data Source
Test Data Source

Here you will see UAT instead of UST in my case. Since I already had a data source named, I had to change the name to UST. Also, in the server field marked in yellow in screen shot, I have scrubbed out the IP address of my Server. In your case, you will see the IP address of the server you entered at the start here.

Now to make sure that the data source is working, click on Test Data Source button. If everything goes right, you should see a screen like this.

DSN successfully tested
DSN successfully tested

Use DSN in R

Well, this was the hard part. Now, we just have to use this data source (DSN in R) to connect to it and run a simple query.

Now we will run the next commands in R. This first command will create a connection object pointing to the DSN that we created earlier. We need to use the same name UAT that we used for the DSN.

uat_conn = odbcConnect("UAT")

In the second step, we will use the connection object created in the previous step and get data from SQL server.

In the second step, we will use the connection object created in the previous step and get data from SQL server. This is the query that I am going to run. This will give me the list of databases from the server.

sqlQuery(uat_conn,"select name from master.sys.sysdatabases where dbid > 4")

Here the query that I am running is enclosed in quotes.

The data source against which this query needs to be run is uat_conn which we had created in the first step. So basically this R command connects to the data source (DSN) object named uat_conn and runs a query. The query is as shown below

select name from master.sys.sysdatabases where dbid > 4

Once you run this in R, you should see the following screen shot.

SQL Server data in R
SQL Server data in R

Summary

So this command from R will connect to the DSN name UAT (This points to the SQL Server which you used when you created the DSN) , gets a list of databases from that server and display it in R.

© 2018 Harshvardhan Diwan

Comments

    0 of 8192 characters used
    Post Comment

    No comments yet.

    working