Sunday, March 22, 2020

Writing Java Again

And finding the sort of puzzles that kept me gainfully employed for so many years. I am trying to tokenize a TAB-delimited line exported from Excel, and StringTokenizer almost does what I want. But empty cells produce two tabs in a row, and StringTokenizer seems to ignore the second one, treating two tabs in a row as a single delimiter. Mysteries.

Definitely works in mysterious ways.

\t\tThird\tFourth

keeps producing tokens with a single tab character out to infinity (or at least, index out of range).
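For anyone wanting to reproduce the puzzle, here is a minimal sketch of how StringTokenizer treats consecutive tabs. The class and method names (TokenizerDemo, defaultTokens, cellsWithDelims) and the sample line are made up for illustration and are not the code from the post. With the two-argument constructor, runs of delimiters are skipped, so empty cells vanish. Passing true as the third argument (returnDelims) hands the tabs back as tokens, which makes it possible to rebuild the empty cells by hand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {
    // Default behavior: consecutive delimiters are skipped, so empty cells are lost.
    static List<String> defaultTokens(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, "\t");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Hypothetical workaround: ask for the delimiters as tokens (returnDelims = true)
    // and insert an empty cell whenever two delimiters arrive back to back.
    static List<String> cellsWithDelims(String line) {
        List<String> cells = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, "\t", true);
        boolean lastWasDelim = true; // start of line behaves like a preceding tab
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            if (tok.equals("\t")) {
                if (lastWasDelim) {
                    cells.add(""); // two tabs in a row (or a leading tab): empty cell
                }
                lastWasDelim = true;
            } else {
                cells.add(tok);
                lastWasDelim = false;
            }
        }
        if (lastWasDelim) {
            cells.add(""); // line ended on a tab (or was empty): trailing empty cell
        }
        return cells;
    }

    public static void main(String[] args) {
        String line = "first\t\tthird\tfourth"; // illustrative input, one empty cell
        System.out.println(defaultTokens(line).size());   // prints 3
        System.out.println(cellsWithDelims(line).size()); // prints 4
    }
}
```

The bookkeeping in the second method is the price of staying with StringTokenizer, and it is easy to get wrong at the start and end of the line.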

2 comments:

Rick C said...

StringTokenizer is a legacy class; it has not been formally deprecated, but its documentation discourages using it in new code. Instead, try using String.split(). It will return an array of Strings, with empty elements for missing cells. Observe:

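public class splitter {
    public static void main(String[] args) {
        // Two empty cells sit between "first" and "third".
        String[] arr = "first\t\t\tthird\tfourth".split("\t");
        for (String s : arr) {
            // Brackets make the empty strings visible in the output.
            System.out.println("[" + s + "]");
        }
    }
}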

This prints
[first]
[]
[]
[third]
[fourth]
which represents a 5-element array, where the 2nd and 3rd elements are empty. (Note: split("\t") drops trailing empty strings, so if a line can end with empty cells you need the two-argument overload of split with a negative number as the second parameter, e.g. split("\t", -1).)
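To make the trailing-cell caveat concrete, here is a small sketch comparing the one-argument and two-argument forms of split. The class name SplitLimitDemo, the helper names, and the sample line are invented for illustration.

```java
import java.util.Arrays;

public class SplitLimitDemo {
    // One-argument split: trailing empty strings are removed from the result.
    static String[] cellsDroppingTrailing(String line) {
        return line.split("\t");
    }

    // Negative limit: the pattern is applied as often as possible and
    // trailing empty strings are kept.
    static String[] cellsKeepingTrailing(String line) {
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        String line = "first\t\tthird\t\t"; // illustrative: last two cells are empty
        System.out.println(cellsDroppingTrailing(line).length); // prints 3
        System.out.println(cellsKeepingTrailing(line).length);  // prints 5
        System.out.println(Arrays.toString(cellsKeepingTrailing(line)));
    }
}
```

For spreadsheet rows, where every line should have the same number of columns, the negative-limit form is usually the safer default.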

Clayton Cramer said...

Rick: Thanks! Much simpler and it works.