Add a Unicode data test
This dumps our Unicode data for a given input and can compare the results against expected values.
I have found it useful for quick inspection of Unicode data, so I am preserving it here for posterity.
Feel free to merge if you think it could be useful to others.
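
For anyone unfamiliar with the dump-and-compare pattern described above, here is a rough Python sketch of the idea. It is only an illustration: it reads properties from Python's standard `unicodedata` module rather than from this project's own Unicode data, and the names `dump_unicode_data` and `find_mismatches` are hypothetical and not part of this change.

```python
import unicodedata
from itertools import zip_longest


def dump_unicode_data(text):
    """Return one record per code point with a few standard properties.

    Assumption: properties come from Python's bundled Unicode database,
    not from the data tables this test actually exercises.
    """
    records = []
    for ch in text:
        records.append({
            "codepoint": f"U+{ord(ch):04X}",
            "name": unicodedata.name(ch, "<unnamed>"),
            "category": unicodedata.category(ch),
            "combining": unicodedata.combining(ch),
        })
    return records


def find_mismatches(text, expected):
    """Compare the dump for `text` against `expected` records.

    Returns a list of (index, actual, expected) tuples; a missing record
    on either side is reported as None.
    """
    actual = dump_unicode_data(text)
    mismatches = []
    for i, (got, want) in enumerate(zip_longest(actual, expected)):
        if got != want:
            mismatches.append((i, got, want))
    return mismatches


if __name__ == "__main__":
    # Dump mode: print the data for quick inspection.
    for record in dump_unicode_data("e\u0301"):
        print(record)
```

With no expected values the sketch is just an inspection tool. Passing a list of expected records turns it into a test, where an empty result from `find_mismatches` means the data matched.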