sudo add-apt-repository -y ppa:linuxgndu/sqlitebrowser
sudo apt-get update
sudo apt-get install sqlitebrowser
A tree is a hierarchical data structure that is often traversed using recursion. For example:
node
object has a name (node.name
)node
object has zero or more other such nodes as children (node.children
)# Start at the root
depth_first(root)
# Function definition. This calls
# itself for every child of the input node
def depth_first(node):
# pre-order
print 'PRE: ', node.name
# recurse further
for child in node.children:
depth_first(child)
# post-order
print 'POST: ', node.name
A depth-first traversal explores as deeply as possible before back tracking, and so the result of the pre- and post-order actions would be (with this script and this input file):
$ perl recursion.pl
PRE: n4
PRE: n3
PRE: n1
PRE: A
PRE: B
POST: B
POST: A
PRE: n2
PRE: C
PRE: D
POST: D
POST: C
POST: n2
POST: n1
PRE: E
POST: E
POST: n3
POST: n4
A breadth-first algorithm:
node
object knows about its next sibling (node.sibling
)node
object knows about its first child (node.first_child
)def breadth_first(node):
print 'PRE: ', node.name
# first explore siblings recursively
if node.sibling:
breadth_first(node.sibling)
# then move deeper
if node.first_child:
breadth_first(node.first_child)
print 'POST: ', node.name
Would produce the following:
PRE: n4
PRE: n3
PRE: E
POST: E
PRE: n1
PRE: n2
PRE: C
PRE: D
POST: D
POST: C
POST: n2
PRE: A
PRE: B
POST: B
POST: A
POST: n1
POST: n3
POST: n4
For a depth-first traversal, all the children for the focal node need to be reachable, so
a simple table design (table node
in tree.db) might be:
id | name | parent |
---|---|---|
1 | A | 5 |
2 | B | 5 |
3 | C | 6 |
4 | D | 6 |
5 | n1 | 7 |
6 | n2 | 7 |
7 | n3 | 9 |
8 | E | 9 |
9 | n4 |
This way, for any focal node, its parent, children and siblings can be queried in
SQL, e.g. given n3
, retrieve the parent:
select p.name from node as n, node as p where n.name = 'n3' and n.parent = p.id
The children:
select c.name from node as n, node as c where n.name = 'n3' and n.id = c.parent
Sibling (but not self):
select s.name from node as n, node as s where n.name = 'n3' and n.parent = s.parent and s.name != 'n3'
SQL queries quickly become cumbersome, for example when attempting to formulate recursive traversals. However, there are some shortcuts to implement common queries, for example:
These can be implemented using additional, pre-computed columns. Here, left
is an
integer that was incremented and assigned to the focal node in pre-order during a
depth-first traversal, right
was incremented and assigned post-order:
id | parent | left | right | name |
---|---|---|---|---|
2 | 1 | 1 | 13 | n4 |
3 | 2 | 2 | 11 | n3 |
4 | 3 | 3 | 6 | n1 |
5 | 4 | 4 | 4 | A |
6 | 4 | 5 | 5 | B |
7 | 3 | 7 | 10 | n2 |
8 | 7 | 8 | 8 | C |
9 | 7 | 9 | 9 | D |
10 | 2 | 12 | 12 | E |
Selecting the descendents of n3
:
select d.name from node as d, node as n where n.name = 'n3' and d.left > n.left and d.right < n.right;
Selecting the ancestor(s) of n3
:
select a.name from node as a, node as n where n.name = 'n3' and a.left < n.left and a.right > n.right;
Selecting the MRCA of A
and C
:
select mrca.name
from
node as mrca,
node as a,
node as c
where
a.name='A' and
c.name='C' and
mrca.left < a.left and
mrca.right > c.right
limit 1;
We are going to figure out which of our crop species are most distant from one another. This means that, in principe, we have to inspect all pairs - so work together in your group.
sqlite3
is unavailableheight
column (distance to root) you should be able to fetch the
distance between a pair.