Sunday, October 26, 2014

Finding Shortest Prefixes for Strings in an Array [Java]

Question:

 Use the shortest unique prefix to represent each word in the array

 input:    ["zebra", "dog", "duck", "dot"]
 output: {zebra: z, dog: do, duck: du, dot: d}

 input:   [zebra, dog, duck, dove]
 output: {zebra: z, dog: do, duck: du, dove: d}

 input:   [bearcat, bear]
 output: {bearcat: be, bear: b}

Do check your program's output for this input:

input: [bbbb, bbb, bb, b]

and this one

input: [b, bb, bbb, bbbb]

Algorithm I came up with:

[We are using two arrays, the one given and the one we'll be using for the prefixes]

1. Make another array for prefixes. Store first characters of original strings in this array.
2. For all the prefixes to the left of it (in the prefix array), check whether this prefix has been used somewhere.
3. If it has been used, check whether you can add on character to the previous duplicate prefix, if not, add one character to the one that is being checked for duplicates.
4. Follow the same procedure for this new updated prefix (whether it was at some previous location in the array or the current one that was being checked, which is now updated)[Recursion]

Source:

import java.util.Arrays; 

public class UniquePrefix { 
 
    public static void checkPrefix(String[] strings, String[] pre, String s, int index){
        
        //System.out.println(Arrays.toString(pre));
        for(int i=index-1;i>=0;--i){
            if(s.matches(pre[i])){ 

                if(s.length()==strings[i].length()){
                    //System.out.println("Can't update the previous one, need to update this one");
                    pre[index] = strings[index].substring(0, s.length()+1);
                    checkPrefix(strings, pre, pre[index], index);
                    return;
                }
                else if(s.length() < strings[i].length()){ 
                    //System.out.println("Can update the previous one");
                    pre[i] = strings[i].substring(0, s.length()+1);
                    checkPrefix(strings, pre, pre[i], i);
                    return;
                } 
            }
        }
    }
    
    public static void findPrefixes(String[] strings){
        System.out.println();
        String[] pre = new String[strings.length];
         
        for(int i=0;i<pre.length;++i){
            if(i>0){
                if(strings[i].matches(strings[i-1])){
                    System.out.println("Duplicate string - Error!"); continue; 
                }
            }
            pre[i]=Character.toString(strings[i].charAt(0)); 
            checkPrefix(strings, pre, pre[i], i);
        } 
        System.out.println(Arrays.toString(strings));
        System.out.println(Arrays.toString(pre));
    }
    
    public static void main(String[] args) {

        //Tests
        String[] strings = {"zebra", "dog", "duck", "dove"};
        findPrefixes(strings);

        strings = new String[]{"zebra", "dog", "duck", "dot"};
        findPrefixes(strings);
        
        strings = new String[]{"zebra", "dog", "duck", "dove", "dorm"};
        findPrefixes(strings);

        strings = new String[]{"bearcat", "bear"};
        findPrefixes(strings);

        strings = new String[]{"bearcat", "bearcat"}; 
        findPrefixes(strings);
        strings = new String[]{"bearcat", "bear", "be", "b"}; 
        findPrefixes(strings);

        strings = new String[]{"bear", "dog", "bea", "bearcat"}; 
        findPrefixes(strings);
        
        strings = new String[]{"b", "bb"};
        findPrefixes(strings);
        
        //Killer tests
        strings = new String[]{"b", "bb", "bbb", "bbbb"}; 
        findPrefixes(strings);
        strings = new String[]{"bbbb", "bbb", "bb", "b"};
        findPrefixes(strings);  
    }
}

Output:

[zebra, dog, duck, dove]
[z, do, du, d]

[zebra, dog, duck, dot]
[z, do, du, d]

[zebra, dog, duck, dove, dorm]
[z, dog, du, do, d]

[bearcat, bear]
[be, b]

Duplicate string - Error!
[bearcat, bearcat]
[b, null]

[bearcat, bear, be, b]
[bear, bea, be, b]

[bear, dog, bea, bearcat]
[bea, d, be, b]

[b, bb]
[b, bb]

[b, bb, bbb, bbbb]
[b, bb, bbb, bbbb]

[bbbb, bbb, bb, b]
[bbbb, bbb, bb, b]

1 comment:

  1. Have you ever heard about prefix trees?

    ReplyDelete