Integrating Neo4J with SQL Server

After evaluating several graph databases, I decided to put more emphasis on Neo4J. Neo4J seems to be better documented and has a better REST API than the others. But still, I did not want to move all my relational data to Neo4J and there was actually no need to do so. I only needed the graph functionality of Neo4J to retrieve the user relationships as a resultset and merge it with my good old relational data. So I decided to go on to my evaluation by preparing a small .Net assembly to access Neo4J REST API and make use of the  CLR functionality in SQL Server.


Let me briefly give you the description of the scenario :

  • Both databases are installed in the same server, localhost,
  • A table called “Users” is created in SQL Server with a primary key of autoincrement column named ID, plus the other columns,
  • Neo4J holds both the profile nodes and their relationships, Each Neo4J node has a property called bspID which maps to SQL Server ID column, and of course also the Neo4J relationships connecting nodes to each other.
  • For the time being there is no need to think about the synchronization between SQL Server and Neo4J,
  • Also I will not cover the performance issues here, what I really want is, to be able to write a TSQL code like the following:
select * from dbo.neo4j_traverse(1) neo, Users u where neo.bspID=u.ID

In order to achieve this, first, we need a class library to access Neo4J REST API, then a CLR class library and some TSQL code to get the assembly installed in SQL Server.

Accessing Neo4J REST API and reading responses

Neo4J REST Service returns JSON responses, so we need a way to parse the JSON responses in .NET. You can write your own parser or use a opensource project like Newtonsoft JSON. To proceed faster, I will use this assembly. Calling the Neo4J API is straight forward, you can use  System.Net.HttpWebRequest and System.Net.HttpWebResponse classes.

Below is a very simple helper class to communicate with Neo4J REST API :

Imports System.Net
Imports System.IO
Imports Newtonsoft.Json.Linq
Imports System.Text
Imports System.Collections.ObjectModel

Public Class neo4j
    Public ServerName As String

    Public Sub New (ByVal pServerName As String)
        ServerName = pServerName
    End Sub

    Private Function SendRequest (ByVal url As String, ByVal method As String, ByVal body As String) As WebResponse
        Dim request As HttpWebRequest = WebRequest.Create (ServerName & url)
        request.Method = method
        request.ContentType = "application/json"
        If (Not String.IsNullOrEmpty (body)) Then
            Dim bytes = Encoding.UTF8.GetBytes (body)
            Using stream As Stream = request.GetRequestStream()
                stream.Write (bytes, 0, bytes.Length)
            End Using
        End If
        Try
            Return request.GetResponse()
        Catch ex As Exception
            Return Nothing
        End Try
    End Function

    Private Function HandleResponse (ByVal response As WebResponse) As String
        Using (response)
            Using stream As Stream = response.GetResponseStream()
                If response.ContentLength <= 0 Then Return String.Empty
                Dim length As Integer = CInt (response.ContentLength)
                Dim Buffer(length) As Byte
                stream.Read (Buffer, 0, length)
                Return Encoding.UTF8.GetString (Buffer)
            End Using
        End Using
    End Function

#Region " Parsers "

    Private Function ParseToken (ByVal token As JToken) As Object
        Select Case token.Type
            Case JTokenType.[Object]
                Return ParseObject (DirectCast (token, JObject))
            Case JTokenType.Array
                Return ParseArray (token)
            Case JTokenType.[Integer]
                Return CInt (token)
            Case JTokenType.Float
                Return CDbl (token)
            Case JTokenType.[String]
                Return CStr (token)
            Case JTokenType.[Boolean]
                Return CBool (token)
            Case JTokenType.[Date]
                Return CDate (token)
            Case JTokenType.Raw
                Return token.Value (Of Byte()) (0)
            Case JTokenType.Bytes
                Return token.Value (Of Byte()) (0)
            Case JTokenType.Null
                Return Nothing
            Case Else
                Throw New ArgumentOutOfRangeException()
        End Select

    End Function

    Private Function ParseArray (ByVal array As JToken) As ICollection(Of Object)
        Dim collection = New Collection(Of Object)
        For Each child As JToken In array.Children()
            collection.Add (ParseToken (child))
        Next
        Return collection
    End Function

    Public Function ParseObject (ByVal obj As JObject) As IDictionary(Of String, Object)
        Dim dictionary = New Dictionary(Of String, Object)
        For Each child As JProperty In obj.Properties()
            ParseProperty (child, dictionary)
        Next
        Return dictionary
    End Function

    Private Sub ParseProperty (ByVal prop As JProperty, ByVal dictionary As IDictionary(Of String, Object))
        dictionary.Add (prop.Name, ParseToken (prop.Value))
    End Sub

#End Region

End Class

Now, let us prepare a simple Person class to store our Neo4J responses :

Public Class person
    Public neoID As Long = 0
    Public bspID As Long = 0
    Public desc As String = ""
    Public scope As String = ""
    Public distance As Integer = 0
    Public reltype As String = ""
    Public reldir As String = ""
    Public reldesc As String = ""
    Public rels As List(Of person) = New List(Of person)
End Class

Now it is time to make our first query against Neo4J.
Following Curl command would return us all the nodes after the traversal of node 1:

curl -H Accept:application/json -H Content-Type:application/json -X POST -d "{\"order\":\"BREADTH FIRST\",\"max depth\":9999999999999}" http://localhost:7474/db/data/node/1/traverse/node

Now let us simulate the same command in a .NET function :

    Public Function Traverse(ByVal neoID As Long) As List(Of person)
        Dim person_arr As List(Of person) = New List(Of person)
        Dim url As String = "/db/data/node/" & neoID & "/traverse/node"
        Dim vbody As String = "{""order"":""BREADTH FIRST"",""max depth"":9999999999}"
        Dim wr As WebResponse = SendRequest(url, "POST", vbody)
        If Not wr Is Nothing Then
            Dim resp As String = HandleResponse(wr)
            Dim oarr As JArray = JArray.Parse(resp)
            For i As Integer = 0 To oarr.Count - 1
                Dim obj As IDictionary(Of String, Object) = ParseObject(oarr(i))
                Dim p As person = New person
                Dim pidarr() As String = obj("self").split("/")
                p.neoID = pidarr(pidarr.Length - 1)
                Try
                    p.bspID = obj("data")("bspID")
                Catch ex As Exception
                End Try
                Try
                    p.desc = obj("data")("desc")
                Catch ex As Exception
                End Try
                Try
                    p.scope = obj("data")("scope")
                Catch ex As Exception
                End Try
                person_arr.Add(p)
                p = Nothing
            Next
        End If
        wr = Nothing
        Return person_arr
    End Function

Voila, we now have a resulting list of person class. We can play with it as we like.

Making use of our Neo4J class to create CLR assembly

Let us create a new class called neo4jconn,

Imports Microsoft.SqlServer.Server
Imports System.Runtime.InteropServices

Partial Public Class neo4jconn

    <SqlFunction(DataAccess:=DataAccessKind.Read, FillRowMethodName:="FillRow", TableDefinition:="bspID int, neoID int, distance int, scope nvarchar(4000), desc nvarchar(4000)")> _
    Public Shared Function Traverse(ByVal bspID As SqlTypes.SqlInt32) As IEnumerable
        'Public Shared Function Traverse(ByVal bspID As Long, Optional ByVal depth As Long = 9999999999) As IEnumerable
        Dim neo As neo4j = New neo4j("http://localhost:7474")
        Dim mainp As person = neo.GetNodeByKey("people", "bspID", bspID)
        Dim plist As List(Of person) = Nothing
        If Not mainp Is Nothing Then
            plist = neo.Traverse(mainp.neoID)
        End If
        mainp = Nothing
        neo = Nothing
        Return plist
    End Function

    Public Shared Sub FillRow(ByVal obj As Object, _
                <Out()> ByRef bspID As SqlTypes.SqlInt32, <Out()> ByRef neoID As SqlTypes.SqlInt32, _
                <Out()> ByRef distance As SqlTypes.SqlInt32, _
                <Out()> ByRef scope As SqlTypes.SqlChars, <Out()> ByRef desc As SqlTypes.SqlChars)
        Dim p As person = CType(obj, person)
        bspID = New SqlTypes.SqlInt32(p.bspID)
        neoID = New SqlTypes.SqlInt32(p.neoID)
        distance = New SqlTypes.SqlInt32(p.distance)
        scope = New SqlTypes.SqlChars(p.scope)
        desc = New SqlTypes.SqlChars(p.desc)
    End Sub

End Class

Compile the project and you will have an assembly that you can use in SQL Server. We will now use some TSQL commands to register our library:

USE MY_DB
GO

EXEC sp_configure 'clr enabled', 1;
RECONFIGURE WITH OVERRIDE;
GO

ALTER DATABASE MY_DB SET Trustworthy ON;
GO

CREATE ASSEMBLY SMDiagnostics
from 'C:\Windows\Microsoft.NET\Framework\v3.0\Windows Communication Foundation\SMDiagnostics.dll'
with permission_set = UNSAFE
GO

CREATE ASSEMBLY [System.Runtime.Serialization]
from 'C:\Windows\Microsoft.NET\Framework\v3.0\Windows Communication Foundation\System.Runtime.Serialization.dll'
with permission_set = UNSAFE
GO

CREATE ASSEMBLY neo4j
FROM 'C:\BBSPdemo\Projects\bspneo4j\bin\Release\neo4j.dll'
WITH PERMISSION_SET = UNSAFE;
GO

Let us now create the SQL Function which will call our .Net function :

CREATE FUNCTION neo4j_traverse (@bspid int)
RETURNS TABLE (bspID int, neoID int, distance int, scope nvarchar(4000), desc nvarchar(4000))
AS EXTERNAL NAME neo4j.[neo4j.neo4jconn].Traverse

Conclusion

select * from dbo.neo4j_traverse(1) neo, Users u where neo.bspID=u.ID

Accessing external data directly from within SQL Server may cause some performance problems, so the final step of our evaluation would be to compare CLR performance against CURL calls. But at least, for the moment, we know that it is possible to access Neo4J from within SQL Server.

Comments, suggestions and questions are welcome.

Enjoy !

Advertisements


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s