hyperlink: tolerate stream read/write errors
@egmontkob
Submitted by Egmont Koblinger Link to original bug (#781797)
Description
VTE is supposed to tolerate read failures from the streams (which might be caused by previous write failures, e.g. out of disk space). It should gracefully forget some scrollback lines, but not crash.
(Note: When encryption is turned off, we might receive invalid data (e.g. all zeros from a sparse block) and interpret it as if it was valid. This is quite bad, potentially a source of weird crashes, maybe we should protect against it by some simple checksumming. When encryption is enabled, it has checksumming embedded. That is, we either get the correct data, or we know that the read failed.)
Not sure if prior to the hyperlink feature this was actually implemented correctly. I might have broken it with the rewrap code for example. But I'm almost sure that the hyperlink code (bug 779734) broke it big time.
We should validate the hyperlink feature's behavior and make sure VTE cannot crash when reading from any of the streams fails.
attr_stream's current hyperlink format is not that friendly for this. Currently the hyperlink is stored in between records in just as many bytes as necessary, making subsequent records not necessarily aligned. That is, there's no way to resynchronize by looking at attr_stream. The offsets from row_stream remain the only synchronization points.
I have the following change to attr_stream's format in mind. Depending on how we address this bug, this refactoring may or may not be necessary, but is probably good to have anyways.
-
Add a padding to URLs so that all the VteStreamCellAttr records are aligned.
-
Enforce a certain set of allowed characters in URLs (and their IDs). We might enforce the 32..126 range, but I'd rather keep raw UTF-8 chars working in quickly hacked shell scripts. However, I'd be absolutely fine disallowing control characters, that is, enforcing the 32..255 range (actually 32..253 because of UTF-8's nature). (Note that the padding would also need to be chosen from these allowed bytes.)
-
In VteStreamCellAttr make a certain byte always have its top three bits unset. That is, the byte is always in the 0..31 range. This allows us to tell for each aligned record whether that's a real VteStreamCellAttr record, or a hyperlink fragment. (Actually, the current "hyperlink_length" field's high byte is such a byte.)
(It might be a "nicer" design to have a starting byte telling the type of the record: either a complete VteStreamCellAttr or a hyperlink fragment. In that case, however, URLs cannot be read/written directly, we'd need to split up the string.)
This construct allows us to resynchonize anywhere after a read failure.
Also note that this construct allows us to walk backwards, finding the previous actual VteStreamCellAttr record. So we'd no longer need to repeat the URL's length after the URL. (We could drop it from before the URL too, but that'd make things slower and more complicated, so let's keep it there.)
Again, this is not yet a solution to the actual bug, but probably a useful step towards that.
Version: git master