incorrect read of .csv with trailing spaces
Expected Behavior
Consider the attached very simple .gnumeric file. Cells A1 and B1 contain strings with trailing spaces. Each string has length 6, as calculated in C1.
Now ssconvert that to a .csv file. The result contains, in its entirety:
"x     ","x     ",6
exactly as expected. So far so good.
Open the .csv file using gnumeric. I would expect A1 and B1 to contain the same strings I started with. This is particularly easy to observe if you select-all and right-justify. If you replace the numerical value 6 with the formula =len(A1) I would expect the calculated length to be 6.
Observed Behavior
I observe that the trailing spaces are lost. The strings now have length 1. This is 100% reproducible on my system. It is inconsistent with the CSV RFC (RFC 4180), notably section 2, point 4, which states that spaces are considered part of a field and should not be ignored.
Here's another way, possibly more convenient, to observe the same problem:
:; ssconvert -T Gnumeric_XmlIO:sax:0 spaces.csv fd://1 | grep -w gnm:Cell
<gnm:Cell Row="0" Col="0" ValueType="60">x</gnm:Cell>
<gnm:Cell Row="0" Col="1" ValueType="60">x</gnm:Cell>
<gnm:Cell Row="0" Col="2" ValueType="40">6</gnm:Cell>
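For comparison, here is a minimal sketch of how a different CSV reader treats the same input. It uses Python's standard csv module, which is not part of gnumeric and is not a claim about how gnumeric's importer works. The file contents are reproduced inline, assuming each string is "x" followed by five spaces (inferred from the stated length of 6); the helper name read_rows is made up for illustration.

```python
import csv
import io

# Contents of spaces.csv as described in this report. Assumption: each
# quoted field is "x" followed by five spaces, giving the stated length 6.
CSV_TEXT = '"x     ","x     ",6\n'


def read_rows(text):
    """Parse CSV text with Python's default dialect (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text)))


rows = read_rows(CSV_TEXT)
# The csv module returns every field as a string, so the third field is "6".
lengths = [len(field) for field in rows[0]]
print(rows[0])
print(lengths)
```

With this reader the first two fields keep their trailing spaces and have length 6, which is the behavior I would expect from gnumeric as well.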
Semi-Workarounds
There exist workarounds using mid() or substitute(), but they are painful.
Platform
This is not a new bug. It is observed with the standard version shipped with Ubuntu focal:
gnumeric version '1.12.38'
datadir := '/usr/local/share/gnumeric/1.12.38'
libdir := '/usr/local/lib/gnumeric/1.12.38'
It is also observed with a very recent version compiled from git sources:
:; uname -srmo
Linux 5.7.1+ x86_64 GNU/Linux
:; lsb_release -a
LSB Version: core-11.1.0ubuntu2-noarch:cxx-3.0-amd64:cxx-3.0-noarch:cxx-3.1-amd64:cxx-3.1-noarch:cxx-3.2-amd64:cxx-3.2-noarch:cxx-4.0-amd64:cxx-4.0-noarch:cxx-4.1-amd64:cxx-4.1-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.0-amd64:graphics-3.0-noarch:graphics-3.1-amd64:graphics-3.1-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch:graphics-4.1-amd64:graphics-4.1-noarch:multimedia-3.2-amd64:multimedia-3.2-noarch:multimedia-4.0-amd64:multimedia-4.0-noarch:multimedia-4.1-amd64:multimedia-4.1-noarch:printing-11.1.0ubuntu2-noarch:security-11.1.0ubuntu2-noarch
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal
:; /usr/src/gnome/gnumeric/src/gnumeric --version
gnumeric version '1.12.49'
datadir := '/usr/src/gnome/install/share/gnumeric/1.12.49'
libdir := '/usr/src/gnome/gnumeric'
:; git log
commit 1c65e7bf124a8a2ddf5047280af3ab0e928b9d51
Date: Sat Sep 26 13:35:17 2020 -0400