Wednesday, September 16, 2020

A Trie GO implementation

 


The image is taken from the Wikipedia



I've recently implemented a trie implementation in GO. While doing so I've used some of the best practices that I've presented at the post: GO Access Control Best Practice, and I thought it would be nice to review the code based on the best practices.


The trie is based on a node struct, which represents a tree root and a tree root.


package trie

import (
"fmt"
"sort"
"strings"
)


type Node struct {
value string
count int
children map[string]*Node
}

func Produce(value string) *Node {
return &Node{
value: value,
count: 0,
children: make(map[string]*Node),
}
}



Notice the Produce function hides the internals of initializing the struct, which in this case includes creation of a map.

To build the trie, we add words to it. In this case the trie is built for sentences, and not for words, so each node's value is a string and not a character. Hence the Add method is as follows:


func (n *Node) Add(values []string) {
if len(values) == 0 {
n.count++
return
}

childName := values[0]
child, exists := n.children[childName]
if !exists {
child = Produce(childName)
n.children[childName] = child
}

child.Add(values[1:])
}



The Add method recursively calls itself until it add all the nodes to hold the sentence.


The last missing piece is the tree traverse, which as expected from a proper code, does not expose the internals of the implementation.


type Scanner func(path []string, count int)

func (n *Node) TraverseDepthFirst(scanner Scanner) {
n.traverseDepthFirst(scanner, []string{})
}

func (n *Node) traverseDepthFirst(scanner Scanner, path []string) {
newPath := make([]string, len(path)+1)
for i, value := range path {
newPath[i] = value
}
newPath[len(path)] = n.value

sorted := n.sortedNodes()
for _, name := range sorted {
child := n.children[name]
child.traverseDepthFirst(scanner, newPath)
}

scanner(newPath, n.count)
}

func (n *Node) sortedNodes() []string {
sorted := make([]string, 0)

for name := range n.children {
sorted = append(sorted, name)
}
sort.Strings(sorted)
return sorted
}



The trie traverse method uses the scanner design pattern, which is run in a depth first scan methodology. This allows, once again, to avoid coupling between the trie internal implementation, and the trie using components.


An example of usage is:


package trie

import (
"fmt"
"testing"
)

func Test(t *testing.T) {
root := Produce("")
root.Add([]string{"a", "dog"})
root.Add([]string{"a", "dog", "is", "barking"})
root.Add([]string{"this", "is", "a", "cat"})
root.Add([]string{"this", "is", "a", "dog"})
root.Add([]string{"this", "is", "a", "dog"})
root.TraverseDepthFirst(func(path []string, count int) {
fmt.Println(path, count)
})
}



And the output of the trie traverse is:


[ a dog is barking] 1
[ a dog is] 0
[ a dog] 1
[ a] 0
[ this is a cat] 1
[ this is a dog] 2
[ this is a] 0
[ this is] 0
[ this] 0
[] 0



No comments:

Post a Comment