LaTeX commands are not parsed correctly
Consider the following LaTeX file:
\documentclass{minimal}
\begin{document}
The three following lines do the same thing, using the textbf command:
\textbf33 \textbf**
\textbf{3}3 \textbf **
\textbf 33 \textbf{*}*
This space command\ is valid in \LaTeX.
So is this newline\
command and so is this tabular\ command as written in the \TeX book, chapter 3.
\end{document}
You can compile it without any error using pdflatex. As expected from Knuth's TeXbook, chapter 3, the three lines do exactly the same thing (they print: 33 **, with the first 3 and the first * in bold).
Moreover, the commands \<space>, \<tab> and \<newline> are all valid. Indeed, according to the TeXbook, a command is either \ followed by one or more letters, or \ followed by exactly one non-letter character.
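The rule above can be sketched as a regular expression. This is only an illustration in Python, not the pattern used by GtkSourceView: the names CONTROL_SEQUENCE and control_sequences are hypothetical, and the sketch assumes the default catcodes (only A-Z and a-z are letters) and ignores comments.

```python
import re

# Control word: backslash followed by one or more letters.
# Control symbol: backslash followed by exactly one non-letter
# (space, tab and newline included, since a negated class matches them).
CONTROL_SEQUENCE = re.compile(r"\\(?:[A-Za-z]+|[^A-Za-z])")


def control_sequences(source: str) -> list[str]:
    """Return every control sequence found in `source`, in order."""
    return CONTROL_SEQUENCE.findall(source)


if __name__ == "__main__":
    for sample in (r"\textbf33", r"\textbf**", "\\ ", "\\\t", "\\\n"):
        print(repr(sample), "->", control_sequences(sample))
```

With this pattern, \textbf33 and \textbf** both yield only \textbf, and \<space>, \<tab> and \<newline> are each recognised as a complete command.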
It is also standard to change the catcode of @ to allow it inside command names. This is mainly done in class and package files, but the end user can enable it at any point in a LaTeX file using \makeatletter. The same hack works with any character, but it is a standard trick with @. See for instance the standard article class file article.cls (which can be located using kpsewhich article.cls).
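To illustrate the effect of treating @ as a letter, here is a minimal Python sketch. The helper control_sequence_pattern and its extra_letters parameter are hypothetical names introduced for this example, and the sample command names are only of the kind found in class files; a real highlighter cannot track catcode changes, so always accepting @ would be an approximation.

```python
import re


def control_sequence_pattern(extra_letters: str = "") -> re.Pattern:
    """Build a control-sequence regex; `extra_letters` are treated as letters."""
    letters = "A-Za-z" + re.escape(extra_letters)
    # Either a run of letters, or exactly one character that is not a letter.
    return re.compile(rf"\\(?:[{letters}]+|[^{letters}])")


if __name__ == "__main__":
    default = control_sequence_pattern()      # @ is not a letter
    at_letter = control_sequence_pattern("@")  # as after \makeatletter
    for sample in (r"\@startsection", r"\if@twocolumn", r"\p@"):
        print(sample, default.match(sample).group(), at_letter.match(sample).group())
```

With @ counted as a letter, the whole name is matched wherever the @ appears; without it, the match stops at the first @.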
Expected behavior: GtkSourceView understands that \textbf33 and \textbf** are the command \textbf followed by arguments, and highlights only the command. It also highlights the commands \<space>, \<tab> and \<newline>. Finally, every command using @ in article.cls should be highlighted in full.
Bug: GtkSourceView highlights all of \textbf33 and \textbf**, and none of \<space>, \<tab> and \<newline>. In article.cls, only commands starting with @ are highlighted; in all other cases the highlighting stops in the middle of the command, as soon as another @ appears.
Tested on: a personal test binary built from the current GtkSourceView source code. The bug can also be reproduced with Gedit 44.2 on Debian 12.4, which uses GtkSourceView 4.8.4-4, and with Gedit 41.0 on Ubuntu 22.04.3 LTS, which uses GtkSourceView 4.8.3-1.
Comment: The main cause of the bug is that @ is only allowed at the beginning of a command, and digits are allowed in command names even though they should not be. Also, * is allowed at the end of commands, even though starred commands are the exception, not the rule, so I believe they should be handled on a case-by-case basis instead. Moreover, one could argue that the * is an argument to the command rather than part of the command itself.