This article will explain ArangoDB, how to install it on Windows, and how to migrate a SQL Server graph database to this NoSQL database management system.
This article is the fifth article of the NoSQL databases series that aims to explain different NoSQL technologies and how to integrate them with SQL Server.
- Note that in this article, we will use Visual Studio 2019. And the source SQL Server database is built based on this article, Understanding graph databases in SQL Server, published previously on SQL Shack
Introduction
As mentioned in this series’s previously published article, the SQL Server graph database is not a native graph; Microsoft added the graph database capabilities on the top of the relation database model. While the hybrid approach provided in SQL Server may help some use cases, especially when we need to integrate graphs with relational tables, native graph databases are still more powerful and reach a higher maturity level.
In the previous articles, we explained how to migrate SQL Server graph databases to the Neo4j graph database. This article is intended for the ArangoDB developers to help them in performing this migration process.
What is ArangoDB?
ArangoDB is an open-source multi-model NoSQL database. It supports three data models:
- Document database
- Key-Value database
- Graph database
The multi-model paradigm allows users to combine each data model advantage within one context. For example, it allows creating nested documents within a graph database or benefiting from the key-value pairs’ high performance in a graph-connected environment.
This database management system was first released in 2011 as AvocadoDB, then renamed to ArangoDB in 2012. As shown in the GitHub repository, ArangoDB is written in C++ (46.4%) and Javascript (48.8%).
Figure 1 – Programming languages used in the ArangoDB GitHub repository
In ArangoDB, data is stored as JSON and can be queried using the ArangoDB query language (AQL).
How to install ArangoDB on Windows?
To install ArangoDB, first, we should open a web browser and navigate to the ArangoDB website. Then, we should click on the “Download” button located in the top right corner.
Figure 2 – ArangoDB website
If we are looking to download the free edition, we should click on the “Get Community edition” link located below the “Download Enterprise Edition” button on the download page.
Figure 3 – Download community edition
Now, we should click on the Windows icon.
Figure 4 – Select Windows operating system
Now, click on the ArangoDB Server NSIS package to download the installation file.
Figure 5 – Downloading ArangoDB Server installation file
Once the file download is complete, we should open it to start the installation. In the installation wizard, you have first to agree with the end-user license agreement. If you want to change the ArangoDB installation path, you can check the “Choose custom install paths for databases and installation” option.
Figure 6 – Installation configuration form
Then, you can enter the ArangoDB root user password (optional).
Figure 7 – Changing root user password
You can keep the “Launch ArangoDB?” option checked to launch the web interface after closing the wizard at the end of the installation.
Figure 8 – Installation complete form
In the web interface, you are asked to login into ArangoDB. For the first time, you should use the “root” user with the password you entered in the installation (use blank if no password was specified).
Figure 9 – Login form
Next, you have to select the database. For the first time, only _system database is available.
Figure 10 – Selecting the current database
To add a new database, in the left side menu, click on “Databases”, then click on the “Add Database” button as shown in the image below.
Figure 11 – Adding a new database
Next, we should set the new database name (In this tutorial, we will set it to “TestGraph”).
Figure 12 – Setting the database name
To change the current database, click on the database name on the top-right corner of the page. This will get you back to the database selection form.
Figure 13 – Changing the current database
Exporting SQL Server graph databases to ArangoDB
After illustrating the ArangoDB community edition installation, we will explain how to migrate SQL Server graph databases to ArangoDB using C#.
Getting things ready
First, let us start by creating a C# .Net Core console application in Visual Studio (steps explained in this article: Migrating SQL Server graph databases to Neo4j)
Next, we should open the Nuget package manager in Visual Studio, search for ArangoDB, select ArangoDB.Client NuGet package. After selecting ArangoDB.Client package, you should check the SQLToArangoDB project, then press on the Install button as shown in the image below.
Figure 14 – Installing the ArangoDB.Client NuGet package
The same steps should be repeated with the System.Data.SqlClient Nuget package (since we are using .Net Core and this package is not included – like it was in the .Net framework).
- Note that there are different ArangoDB. Net client NuGet packages, but we used the ArangoDB.Client since it is the most popular package. Moreover, there is no official .NET client provided by ArangoDB; the ArangoDB community provides all drivers. You can check all available drivers on the following page: ArangoDB database client drivers (Java, JavaScript, Go, Python, PHP)
Reading from SQL Server graph database
As explained in the previously published article Migrating SQL Server graph databases to Neo4j, to read and write columns, nodes, and edges information, we will create three classes as follows:
Column.cs
1 2 3 4 5 6 7 8 9 10 11 12 13 |
namespace SqlToArangoDB { public class Column { public string Name { get; set; } public string DataType { get; set; } public Column(string name, string type) { Name = name; DataType = type; } } } |
Node.cs
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
using System.Collections.Generic; namespace SqlToArangoDB { public class Node { public string ID { get; set; } public string Label { get; set; } public Dictionary<string, object> Properties { get; set; } public Node(string id, string label, Dictionary<string, object> properties = null) { ID = id; Label = label; if (properties != null) Properties = properties; else Properties = new Dictionary<string, object>(); } } } |
Edge.cs
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
using System.Collections.Generic; namespace SqlToArangoDB { public class Edge { public string ID { get; set; } public string Label { get; set; } public Dictionary<string, object> Properties { get; set; } public string FromNode { get; set; } public string FromNodeID { get; set; } public string ToNode { get; set; } public string ToNodeID { get; set; } public Edge(string id, string label, string from, string fromID, string to, string toID, Dictionary<string, object> properties = null) { ID = id; Label = label; FromNode = from; FromNodeID = fromID; ToNodeID = toID; ToNode = to; if (properties != null) Properties = properties; else Properties = new Dictionary<string, object>(); } } } |
To read data from the SQL Server database, we created a class called SQLReader.cs. The only parameter the user needs to pass in this class constructor is the connection string (SQL Server instance + database + authentication).
This class contains two functions GetNodes() and GetEdges(), that are used to read nodes and edges from the SQL Server graph database.
SQLReader.cs
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 |
using System; using System.Collections.Generic; using System.Linq; using System.Data; using System.Data.SqlClient; namespace SqlToArangoDB { public class SQLReader : IDisposable { public List<Node> Nodes { get; set; } public List<Edge> Edges { get; set; } public string ConnectionString { get; set; } public SQLReader(string connection) { ConnectionString = connection; } public void GetNodes() { Nodes = new List<Node>(); //initialize connection using (SqlConnection sqlcon = new SqlConnection(ConnectionString)) { sqlcon.Open(); //retrieving nodes tables using (SqlCommand sqlnodes = new SqlCommand("select name from sys.tables where is_node = 1", sqlcon)) { SqlDataReader tablesreader; tablesreader = sqlnodes.ExecuteReader(); List<string> tables = new List<string>(); //get nodes tables while (tablesreader.Read()) tables.Add(tablesreader[0].ToString()); tablesreader.Close(); foreach (string tablename in tables) { //Get columns with data types string cmdColumns = String.Format("select column_name,data_type from information_schema.columns columns where table_name = '{0}'" + "and exists(select 1 from information_schema.columns temp where temp.column_name like 'graph_id_%'" + "and temp.TABLE_SCHEMA = columns.table_schema and temp.TABLE_NAME = columns.TABLE_NAME)", tablename); List<Column> columns = new List<Column>(); using (SqlCommand sqlcmd = new SqlCommand(cmdColumns, sqlcon)) { SqlDataReader columnsreader = sqlcmd.ExecuteReader(); while (columnsreader.Read()) { columns.Add(new Column(columnsreader[0].ToString(), columnsreader[1].ToString())); } columnsreader.Close(); } string idcolumn = columns.Where(x => x.Name.StartsWith("$node_id")).FirstOrDefault().Name; string propColumns = string.Join(",", columns.Select(x => x.Name).Where(y => !y.StartsWith("$") && !y.StartsWith("graph_id_"))); string cmdNodes = "select JSON_VALUE([" + idcolumn + "],'$.id') as node_id " + (propColumns == "" ? "" : "," + propColumns); cmdNodes = cmdNodes + string.Format(" from {0}", tablename); //get nodes using (SqlCommand sqlcmd = new SqlCommand(cmdNodes, sqlcon)) { SqlDataReader nodesreader = sqlcmd.ExecuteReader(); //Get properties while (nodesreader.Read()) { Dictionary<string, object> properties = new Dictionary<string, object>(); foreach (Column col in columns.Where(x => !x.Name.StartsWith("$") && !x.Name.StartsWith("graph_id_"))) { properties.Add(col.Name, nodesreader[col.Name]); } properties.Add("node_id", nodesreader["node_id"].ToString()); Nodes.Add(new Node(nodesreader["node_id"].ToString(), tablename, properties)); } nodesreader.Close(); } } } } } public void GetEdges() { Edges = new List<Edge>(); //initialize connection using (SqlConnection sqlcon = new SqlConnection(ConnectionString)) { sqlcon.Open(); //retrieving nodes tables using (SqlCommand sqlnodes = new SqlCommand("select name from sys.tables where is_edge = 1", sqlcon)) { SqlDataReader tablesreader; tablesreader = sqlnodes.ExecuteReader(); List<string> tables = new List<string>(); //get edges tables while (tablesreader.Read()) tables.Add(tablesreader[0].ToString()); tablesreader.Close(); foreach (string tablename in tables) { //Get columns with data types string cmdColumns = String.Format("select column_name,data_type from information_schema.columns columns where table_name = '{0}'" + "and exists(select 1 from information_schema.columns temp where temp.column_name like 'graph_id_%'" + "and temp.TABLE_SCHEMA = columns.table_schema and temp.TABLE_NAME = columns.TABLE_NAME)", tablename); List<Column> columns = new List<Column>(); using (SqlCommand sqlcmd = new SqlCommand(cmdColumns, sqlcon)) { SqlDataReader columnsreader = sqlcmd.ExecuteReader(); while (columnsreader.Read()) { columns.Add(new Column(columnsreader[0].ToString(), columnsreader[1].ToString())); } columnsreader.Close(); } string idcolumn = columns.Where(x => x.Name.StartsWith("$edge_id")).FirstOrDefault().Name; string fromid = columns.Where(x => x.Name.StartsWith("$from_id")).FirstOrDefault().Name; string toid = columns.Where(x => x.Name.StartsWith("$to_id")).FirstOrDefault().Name; string propColumns = string.Join(",", columns.Select(x => x.Name).Where(y => !y.StartsWith("$") && !y.StartsWith("graph_id_") && !y.StartsWith("from_") && !y.StartsWith("to_"))); string cmdNodes = "select JSON_VALUE([" + idcolumn + "],'$.id') as edge_id " + ",JSON_VALUE([" + fromid + "],'$.id') as from_id " + ",JSON_VALUE([" + fromid + "],'$.table') as from_table " + ",JSON_VALUE([" + toid + "],'$.id') as to_id " + ",JSON_VALUE([" + toid + "],'$.table') as to_table" + (propColumns == "" ? "" : "," + propColumns); cmdNodes = cmdNodes + string.Format(" from {0}", tablename); using (SqlCommand sqlcmd = new SqlCommand(cmdNodes, sqlcon)) { SqlDataReader edgesreader = sqlcmd.ExecuteReader(); //Get properties while (edgesreader.Read()) { Dictionary<string, object> properties = new Dictionary<string, object>(); foreach (Column col in columns.Where(y => !y.Name.StartsWith("$") && !y.Name.StartsWith("graph_id_") && !y.Name.StartsWith("from_") && !y.Name.StartsWith("to_"))) { properties.Add(col.Name, edgesreader[col.Name]); } Edges.Add(new Edge(edgesreader["edge_id"].ToString(), tablename, edgesreader["from_table"].ToString(), edgesreader["from_id"].ToString(), edgesreader["to_table"].ToString(), edgesreader["to_id"].ToString(), properties)); } edgesreader.Close(); } } } } } public void Dispose() { Nodes = null; Edges = null; ConnectionString = null; } } } |
Writing Nodes and Edges to ArangoDB
To write exported nodes and edges from SQL Server into the ArangoDB database, we created ArangoDBWriter.cs class. This class takes the following parameters:
- ArangoDB service URI (default is http://localhost:8529)
- Database name
- User name
- Password
We also defined two functions ImportNodes() and ImportEdges() to import nodes and edges into the ArangoDB database.
ArangoDBWriter.cs
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 |
using System; using System.Collections.Generic; using System.Linq; using ArangoDB.Client; using System.Net; using Newtonsoft.Json; namespace SqlToArangoDB { public class ArrangoDbWriter : IDisposable { public string url { get; set; } public string Database { get; set; } public string UserName { get; set; } public string Password { get; set; } public IArangoDatabase db { get; set; } public ArrangoDbWriter(string url, string Database, string UserName, string Password) { ArangoDatabase.ChangeSetting(s => { s.Database = Database; s.Url = url; s.Credential = new NetworkCredential(UserName, Password); s.SystemDatabaseCredential = new NetworkCredential(UserName, Password); }); db = ArangoDatabase.CreateWithSetting(); } public void ImportNodes(List<Node> nodes) { foreach (Node node in nodes) { if(db.ListCollections().Where(x => x.Name == node.Label).Count() == 0) db.CreateCollection(node.Label, type: CollectionType.Document); var obj = node.Properties; obj.Add("_key", node.ID); string json = JsonConvert.SerializeObject(obj); db.Collection(node.Label).Insert(obj); } } public void ImportEdges(List<Edge> edges) { foreach (Edge edge in edges) { if (db.ListCollections().Where(x => x.Name == edge.Label).Count() == 0) db.CreateCollection(edge.Label, type: CollectionType.Edge); var obj = edge.Properties; obj.Add("_key", edge.ID); obj.Add("_from",edge.FromNode + @"/" + edge.FromNodeID); obj.Add("_to",edge.ToNode + @"/" + edge.ToNodeID); string json = JsonConvert.SerializeObject(obj); db.Collection(edge.Label).Insert(obj); } } public void Dispose() { url = null; Database = null; UserName = null; Password = null; } } } |
Importing data into ArangoDB
After creating the classes we mentioned above, we will use the following code in the “Program.cs” class to import data from an SQL Server graph database (named GraphPL) into the ArangoDB database (named TestGraph).
Program.cs
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
namespace SQLToArangoDB { public class Program { static void Main(string[] args) { using (SQLReader reader = new SQLReader("Server=<sql server instance>;Database=GraphPL;Trusted_Connection=yes;")) { reader.GetNodes(); reader.GetEdges(); using (ArrangoDbWriter writer = new ArrangoDbWriter("http://localhost:8529", "TestGraph", "root", "")) { writer.ImportNodes(reader.Nodes); writer.ImportEdges(reader.Edges); } } } } } |
To ensure that all nodes and edges are exported successfully to the ArangoDB database, let us open the ArangoDB web interface and then click on the Collections tab. We should see all nodes and edges listed in the Collection tab, as shown in the image below.
Figure 15 – Nodes and Edges exported to ArangoDB
If we click on a collection, we can see all the nodes or edges imported within it (the image below shows the nodes imported into the Cities collection).
Figure 16 – Nodes imported into the Cities collection
Conclusion
This article explained briefly the ArangoDB database and how to install it on the Windows operating system. Then, it provided a step-by-step guide to migrate nodes and edges from a SQL Server graph database to an ArangoDB database using C# .Net core application. The code we provided in this article was published on GitHub and may need some improvements.
Table of contents
- An overview of SQL Server monitoring tools - December 12, 2023
- Different methods for monitoring MongoDB databases - June 14, 2023
- Learn SQL: Insert multiple rows commands - March 6, 2023