Sifiso Ndlovu

Warehousing JSON formatted data in SQL Server 2016

June 22, 2016 by

In this article, I continue to review the exciting features available in SQL Server 2016. One such feature is the long awaited T-SQL support for JSON formatted data. In this article we take a look at how JSON support will impact data warehouse solutions.

Background

Since the advent of EXtensible Markup Language (XML) many modern web applications have focused on providing data that is both human-readable and machine-readable. From a relational database perspective, SQL Server kept up with these modern web applications by providing support for XML data in a form of an XML data type and several functions that could be used to parse, query and manipulate XML formatted data.

As a result of being supported in SQL Server, data warehouse solutions based off SQL Server were then able to source XML-based OLTP data into a data mart. To illustrate this point, let’s take a look at the XML representation of our fictitious Fruit Sales data shown in Figure 1.


Figure 1: XML representation of Fruit Sales data

To process this data in data warehouse, we would first have to convert it into relational format of rows and columns using T-SQL XML built-in functions such as the nodes() function. Figure 2 shows a script that makes use of the nodes() function to convert the sample data shown in Figure 1.


Figure 2: XML representation of Fruit Sales data

The results of the above script are shown in Figure 3 in a recognisable format for data warehouse.


Figure 3: XML data represented in a table format

Soon after XML became a dominant language for data interchange for many modern web applications, JavaScript Object Notation (JSON) was introduced as a lightweight data-interchange format that is more convenient for web applications to process than XML. Likewise most relational database vendors released newer versions of their database systems that included the support for JSON formatted data. Unfortunately, Microsoft SQL Server was not one of those vendors and up until SQL Server 2014, JSON data was not supported. Obviously this lack of support for JSON, created challenges for data warehouse environments that are based off SQL Server.

Although there were workarounds (i.e. using Json.Net) to addressing the lack of JSON support in SQL Server, there was always sense that these workarounds were inadequate, time-wasting, and were forcing data warehouse development teams to pick up a new skill (i.e. learn .Net). Fortunately, the release of SQL Server 2016 has ensured that development teams can throw away their JSON workarounds as JSON is supported in SQL Server 2016.

Parsing JSON Data into Data Warehouse

Similarly to XML support in SQL Server, SQL Server supports of JSON can be classified into two ways:

  1. Converting Relational dataset into JSON format

  2. Converting JSON dataset into relational format

However, for the purposes of this discussion we are focusing primarily on the second part – which is converting a JSON formatted data (retrieved from OLTP sources) into a relational format of rows and columns. To illustrate our discussion points we once again make use of the fictitious fruit sales dataset. This time around the fictitious dataset has been converted into a JSON format as shown in Figure 4.


Figure 4: JSON representation of Fruit Sales dataset

ISJSON function

As part of supporting JSON formatted data in other relational databases such as MySQL and PostgreSQL 9.2, there is a separate JSON data type that has been introduced by these vendors. Amongst other things, JSON data type conducts validation checks to ensure that values being stored are indeed of valid JSON format.

Unfortunately, SQL Server 2016 (and ORACLE 12c) do not have a special data type for storing JSON data instead a variable character (varchar/nvarchar) data type is used. Therefore, a recommended practice to dealing with JSON data in SQL Server 2016 is to firstly ensure that you are indeed dealing with a valid JSON data. The simplest way to do so is to use the ISJSON function. This is a built-in T-SQL function that returns 1 for a valid JSON dataset and 0 for invalids.

Figure 5 shows us the implementation of ISJSON function whereby we validate our fictitious sample dataset.


Figure 5: Validating our fictitious JSON data using ISJSON

OPENJSON function

Now that we have confirmed that we are working with a valid JSON dataset, the next step is to convert the data into a table format. Again, we have a built-in T-SQL function to do this in a form of OPENJSON. OPENJSON works similar to OPENXML in that it takes in an object and convert its data into rows and columns.

Figure 6 shows a complete T-SQL script for converting JSON object into rows and columns.


Figure 6: OPENJSON T-SQL sample script

Once we execute the above script, we get relational output shown in Figure 7.


Figure 7: OPENJSON T-SQL output

Now that we have our relational dataset, we can process this data into data warehouse.

JSON_VALUE function

Prior to concluding our discussion of JSON in SQL Server 2016, it is worth mentioning that in addition to OPENJSON, you have other functions such as JSON_VALUE that could be used to query JSON data. However this function returns a scalar value which means that unlike the multiple rows and columns returned using OPENJSON, JSON_VALUE returns a single value as shown in Figure 8.


Figure 8: Single value output using JSON_VALUE function

If you the JSON object that you are querying doesn’t have multiple elements, than you don’t have to specify the row index (i.e. [0]) as shown in Figure 9.


Figure 9: JSON_VALUE T-SQL script without row index

Conclusion

The long wait is finally over and with the release of SQL Server 2016, JSON is now supported. Similarly to XML, T-SQL support the conversion of JSON object to relational format as well the conversion of relational tables to a JSON object. This support is implemented via built-in T-SQL functions such as OPENJSON and JSON_VALUE. In spite of all the excitement with the support of JSON is SQL Server 2016, we still don’t have a JSON data type. The ISJSON function can then be used to validate JSON text.

Downloads

Reference


Sifiso Ndlovu
168 Views