Potential replacements for hand-rolled (?) C source scanner
I remember @ebassi stating it would be a good idea to do so, and from looking at the code, in particular the macro parsing stage, I would tend to agree.
Looking around for options, I think https://github.com/ned14/pcpp could be a good fit for the preprocessing stage / macro extraction, and https://github.com/eliben/pycparser/tree/master/pycparser for the C parsing stage. Both are:
- pure python modules
- based on https://github.com/dabeaz/ply
- able to provide precise source code positions
`pcpp` can be interfaced with via `PreprocessorHooks`, in particular `on_comment`. It also gets right the one case where the current scanner annoyed me:
    #define NUMBERS 1, \
      2, \
      3
The current scanner doesn't do well with multiline macros in general, but in the case above, or even with a plain `#define NUMBERS 1, 2, 3`, it will actually end up advertising a constant named `NUMBERS` with a value of 1 in the gir. I'm sure this could be special-cased, but afaict the current code / approach is much too naive in general.
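To make the failure mode concrete, here is a minimal stdlib-only sketch of the two steps a scanner has to get right for this case: splice backslash-continued lines, then only treat an object-like macro as a constant when its whole body is a single literal. This is neither the current scanner's code nor pcpp's API; `splice_continuations`, `object_macros` and `constant_value` are hypothetical helpers, and the literal check is deliberately simplistic (decimal and hex integers only, no suffixes or octal).

```python
import re

# Hypothetical helpers for illustration; not part of g-ir-scanner or pcpp.

# Object-like macros only: the name must not be directly followed by "(".
_DEFINE_RE = re.compile(
    r"^[ \t]*#[ \t]*define[ \t]+([A-Za-z_]\w*)(?![\w(])[ \t]*(.*)$"
)
# Simplified integer literal: decimal or hex, no suffixes, no octal.
_INT_RE = re.compile(r"^[+-]?(0[xX][0-9a-fA-F]+|[1-9]\d*|0)$")


def splice_continuations(source):
    """Remove backslash-newline pairs so continued lines become one line."""
    return re.sub(r"\\\r?\n", "", source)


def object_macros(source):
    """Map each object-like macro name to its normalized body text."""
    macros = {}
    for line in splice_continuations(source).splitlines():
        match = _DEFINE_RE.match(line)
        if match:
            name, body = match.groups()
            # Collapse the runs of whitespace left over from splicing.
            macros[name] = " ".join(body.split())
    return macros


def constant_value(body):
    """Return an int if the whole body is one integer literal, else None."""
    return int(body, 0) if _INT_RE.match(body) else None


source = "#define NUMBERS 1, \\\n  2, \\\n  3\n#define ANSWER 42\n"
macros = object_macros(source)
# macros["NUMBERS"] is "1, 2, 3", so it is not reported as a constant;
# macros["ANSWER"] is "42", which is.
constants = {name: constant_value(body) for name, body in macros.items()}
```

The point of checking the whole body rather than its first token is that `NUMBERS` yields `None` here instead of 1. A real preprocessor such as pcpp does this on a proper token stream rather than with regular expressions.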
`pycparser`, obviously, is interacted with through the AST it produces.
Re. licensing, both modules and `ply` are BSD-3-clause.
The obvious question is distribution. `pycparser` is packaged in Fedora as far as I can tell, but `pcpp` isn't. Both are available on PyPI, of course.