Sunday, March 22, 2020

Writing Java Again

And finding the sort of puzzles that kept me gainfully employed for so many years. I'm trying to tokenize a tab-delimited line exported from Excel, and StringTokenizer almost does what I want. But empty cells produce two tabs in a row, and StringTokenizer seems to ignore the second one, treating two tabs as a single delimiter. Mysteries.

Definitely works in mysterious ways.

\t\tThird\tFourth

keeps producing tokens with a single tab character out to infinity (or at least, index out of range).
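For anyone who wants to see the tab-collapsing behavior for themselves, here is a minimal sketch. The class name TokenizerDemo, the helper tokens(), and the sample line are made up for illustration; the three-argument StringTokenizer constructor with its returnDelims flag is part of the standard library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {
    // Hypothetical helper: collect every token StringTokenizer produces for a tab-delimited line.
    static List<String> tokens(String line, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(line, "\t", returnDelims);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "first\t\tthird\tfourth"; // one empty cell between first and third

        // Default mode: runs of delimiters are skipped, so the empty cell vanishes.
        System.out.println(tokens(line, false).size()); // 3

        // returnDelims = true: every tab comes back as its own one-character token.
        System.out.println(tokens(line, true).size()); // 6
    }
}
```

With returnDelims set to true, an empty cell shows up as two tab tokens in a row, so the caller has to track the delimiters itself to recover the cell positions.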

2 comments:

  1. StringTokenizer is a legacy class, and its use is discouraged in new code. Instead, try String.split(). It returns an array of Strings, with empty elements for missing cells. Observe:

    public class Splitter {
        public static void main(String[] args) {
            // Each tab is a separator, so consecutive tabs yield empty strings.
            String[] arr = "first\t\t\tthird\tfourth".split("\t");
            for (String s : arr) {
                System.out.println("[" + s + "]");
            }
        }
    }

    This prints
    [first]
    []
    []
    [third]
    [fourth]
    which represents a 5-element array, where the 2nd and 3rd elements are empty. (Note that split("\t") drops trailing empty elements. If a line can end with empty cells, use the two-argument overload, split(regex, limit), with a negative number as the limit.)

  2. Rick: Thanks! Much simpler and it works.

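To illustrate the note in the first comment about trailing empty cells, here is a minimal sketch comparing the two overloads of split. The class name TrailingCells, the helper cells(), and the sample line are made up for illustration.

```java
public class TrailingCells {
    // Hypothetical helper: split a tab-delimited line, keeping trailing empty cells.
    static String[] cells(String line) {
        // A negative limit tells split not to discard trailing empty strings.
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        String line = "first\tsecond\t\t"; // four cells, the last two empty

        // One-argument split drops the trailing empty cells.
        System.out.println(line.split("\t").length); // 2

        // The negative-limit overload keeps them.
        System.out.println(cells(line).length); // 4
    }
}
```

If the spreadsheet has a fixed number of columns, the negative-limit form is probably the safer default, since the array length then matches the column count even when the last cells are blank.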