Sweeping Trees in Java


The challenge

You will be given a list of strings representing nodes in a rooted tree. A visual representation of a very simple rooted tree could be:

        +----+
        |root|
        ++--++
         |  |
    +----+  +----+
    v            v
+------+      +------+
|node A|      |node B|
+------+      +------+

In this case, the root node has two children, nodes A and B. A more complex example would be:

        +----+
        |root|
        ++--++
         |  |
    +----+  +----+
    v            v
+------+      +------+
|node A|      |node B|
+--+---+      +------+
   |
   +-------------+
   v             v
+------+      +------+
|Node C|      |node D|
+------+      +------+

Here, The root has two children (nodes A and B), and node A in turn also has two children (nodes C and D). Each of these node strings contains three comma-separated values in the given order:

  • a unique, non-zero length, alphanumeric node id
  • the id of its parent node
  • its status, which may either be “valid” or “invalid”

The root node of a tree always has an empty string as its parent id. The simple example from above would be represented by the following list (assuming all nodes have the status “valid”):

["root,,valid", "node A,root,valid", "node B,root,valid"]

The more complex example would result in the list

["root,,valid", "node A,root,invalid", "node B,root,valid", "node C,node A,valid", "node D,node A,valid"]

This example assumes that all nodes apart from A have the status “valid”.

Task

The goal of this challenge is to remove the subtrees of all invalid nodes from the tree and return the set of ids of the remaining tree nodes.

  • An invalid node is one whose status equals “invalid”.
  • “Removing the subtree” indicates that you should both remove the invalid nodes as well as all of their descendants (i.e. their children, the children of their children and so on).

The expected result set for the more complex example would hence contain the following strings in arbitrary order:

["root", "node B"]

Notes

The list always represents a valid tree. This means that

  • there is only one root node in the list
  • the parent id of each node will correspond to the id of another node in the list (apart from the root node, of course, whose parent id is an empty string)
  • there are no cycles in the tree

The nodes in the list may be in any order, irrespectively of their parent/child relations.

The lists may contain a large number of nodes. If any test fails because of a timeout, this likely means that while your solution is formally correct, it is too inefficient.

The solution in Java code

Option 1:

import java.util.*;

public class SweepingTrees {

    public Set<String> determineValidIds(List<String> dirtyTree) {
        Map<String,List<String>> nodes = new HashMap<>();
        dirtyTree.stream()
                 .filter( s -> !s.endsWith(",invalid"))
                 .map(s -> s.split(","))
                 .forEach(sArr -> nodes.computeIfAbsent(sArr[1], k -> new ArrayList<String>())
                                       .add(sArr[0]) );
        
        Set<String> out = new HashSet<>();
        Stack<String> stk = new Stack<>();
        stk.add("");
        while (!stk.isEmpty()) {
            List<String> subNodes = nodes.get(stk.pop());
            if (subNodes==null) continue;
            out.addAll(subNodes);
            stk.addAll(subNodes);
        }
        return out;
    }
}

Option 2:

import java.util.*;
import static java.util.stream.Collectors.toMap;

public class SweepingTrees {

  public Set<String> determineValidIds(List<String> dirtyTree) {
        Set<String> result = new LinkedHashSet<>();
 
        if (!dirtyTree.contains("root,,valid")) return result;
    
        Map<String, String> treeMap = new HashMap<>();
        for (String s : dirtyTree) {
            String[] itemSplit = s.split(",");
            if (itemSplit[2].equals("valid")) {
                treeMap.put(itemSplit[0], itemSplit[1]);
            }
        }

        for (Map.Entry<String, String> entry : treeMap.entrySet()) {
            String key = entry.getKey();
            String value = entry.getValue();
            if (value.equals("")) {
                result.add(key);
            }else if (treeMap.containsKey(value)) {
                if (checkKey(value, treeMap)) result.add(key);
            }
        }
        return result;
    }

    public static boolean checkKey(String string, Map<String, String> map) {
        if (string.equals("root")) return true;
        if (map.containsKey(string)) {
            if (checkKey(map.get(string), map)) return true;
        }
        return false;
    }
  
}

Option 3:

import java.util.List;
import java.util.Set;
import java.util.HashSet;

public class SweepingTrees {

  public Set<String> determineValidIds(List<String> dirtyTree) {
    Set<String> valid = new HashSet<>(); 
    Set<String> invalid = new HashSet<>(); 
    
    for(String node : dirtyTree) {
      if(node.substring(node.lastIndexOf(",") + 1).equals("invalid") && 
        node.substring(0, node.indexOf(",")).equals("root"))
          return new HashSet<>();
      if(invalid.contains(node.substring(node.indexOf(",") + 1, node.lastIndexOf(","))) || 
        node.substring(node.lastIndexOf(",") + 1).equals("invalid")) {
        invalid.add(node.substring(0, node.indexOf(",")));
      } else {
        valid.add(node.substring(0, node.indexOf(",")));
      }
    }
    
    return valid;
  }
}

Test cases to validate our solution

import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.junit.Test;

public class SweepingTreesExampleTest {
	
	@Test
	public void test1() {
		List<String> tree = Arrays.asList("root,,valid", "invalid child,root,invalid", "valid child,root,valid");
		Set<String> expectedIds = new HashSet<>(Arrays.asList("root", "valid child"));
		
		Set<String> cleanedIds = new SweepingTrees().determineValidIds(tree);
		
		assertEquals(expectedIds, cleanedIds);
	}
	
	@Test
	public void test2() {
		List<String> tree = Arrays.asList("root,,valid", "invalid parent,root,invalid", "valid child,invalid parent,valid", "valid child2,invalid parent,valid");
		Set<String> expected = new HashSet<>(Arrays.asList("root"));
		
		Set<String> cleanedTree = new SweepingTrees().determineValidIds(tree);
		
		assertEquals(expected, cleanedTree);
	}
	
	@Test
	public void test3() {
		List<String> tree = Arrays.asList("root,,valid", "invalid parent,root,invalid", "valid child,invalid parent,valid", "valid child2,invalid parent,valid", "valid child3,invalid parent,valid", "valid child3,invalid parent,valid", "valid child4,valid child,valid", "valid parent,root,valid");
		Set<String> expected = new HashSet<>(Arrays.asList("root", "valid parent"));
		
		Set<String> cleanedTree = new SweepingTrees().determineValidIds(tree);
		
		assertEquals("Did not delete all children of invalid entries", expected, cleanedTree);
	}
	
}