Add column support in c_parser #178

serpilliere · 2017-03-09T17:35:47Z

Hi!

This PR adds support for coord.column in the AST parser.

It's inspired by #92, by @bbejot. The main difference is that it only
fixes the missing column: it adds neither end_lineno nor
end_column.

Note: as the _coord function is barely used any more (only two calls remain),
tell me if you would prefer removing it in favour of ._token_coord.

eliben

Thanks for the PR!

A general question: did you measure the performance impact of this change?

eliben · 2017-03-10T04:21:10Z

pycparser/plyparser.py

@@ -51,6 +51,13 @@ def _coord(self, lineno, column=None):
                line=lineno,
                column=column)

+    def _token_coord(self, p, token_idx):


this function could use a docstring, as well as comments inside

You are right! I will add it.
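
For readers following the thread, here is a minimal sketch of how a 1-based column can be derived from a lexer offset, in the spirit of what _token_coord is described as doing. The helper name `column_from_lexpos` and its arguments are hypothetical and not taken from the patch; the sketch only assumes that the full input text and a 0-based character offset per token are available (PLY exposes these as `lexdata` and `lexpos`).

```python
def column_from_lexpos(text, lexpos):
    """Return the 1-based column of the character at 0-based offset `lexpos`.

    Hypothetical helper for illustration; not the function added by the PR.
    """
    # Index of the last newline strictly before the token, or -1 when the
    # token sits on the first line of the input.
    last_newline = text.rfind('\n', 0, lexpos)
    # With last_newline == -1, the very first character gets column 1.
    return lexpos - last_newline


source = "int a;\nint b;"
print(column_from_lexpos(source, source.index('a')))  # 5
print(column_from_lexpos(source, source.index('b')))  # 5
```

Because `str.rfind` returns -1 when no newline is found, the first line needs no special case and columns start at 1, matching the lex convention discussed below.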

eliben · 2017-03-10T04:21:38Z

tests/test_c_parser.py

        self.assertEqual(node.coord.line, line)
+        if column:


what if column == 0?

In fact, the column as implemented keeps the semantics of lex, so the column always starts at 1, not 0.
Do you want me to test against None (plus an assert against 0)?

By the way, do you agree to keep the lex semantics (column AND lineno both starting at 1), or do you want the column to start at 0?

Yes, a default value of None and a test against None should be better, I think

Starting column at 1 is fine
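
A minimal sketch of the agreed behaviour, assuming a test helper whose `column` argument defaults to None and is compared with `is not None` rather than by truthiness. The `Coord` and `Node` tuples and the `CoordAssertions` class are hypothetical stand-ins so the example runs on its own; they are not pycparser's classes, and the argument order simply mirrors the `assert_coord(node, line, column, file)` calls visible in the diff.

```python
import unittest
from collections import namedtuple

# Hypothetical stand-ins for an AST node and its coordinate.
Coord = namedtuple('Coord', ['file', 'line', 'column'])
Node = namedtuple('Node', ['coord'])


class CoordAssertions(unittest.TestCase):
    def assert_coord(self, node, line, column=None, file=None):
        self.assertEqual(node.coord.line, line)
        # Explicit None check: the comparison no longer depends on the
        # truthiness of the expected column.
        if column is not None:
            self.assertEqual(node.coord.column, column)
        if file is not None:
            self.assertEqual(node.coord.file, file)

    def test_column_checked_when_given(self):
        node = Node(Coord('test.c', 2, 13))
        self.assert_coord(node, 2, 13, 'test.c')
        self.assert_coord(node, 2)  # column and file omitted: only line is checked
        with self.assertRaises(AssertionError):
            self.assert_coord(node, 2, 1)  # wrong column must fail
```

With `if column:` an expected column of 0 would silently skip the comparison; since columns start at 1 here that case should not arise, but the explicit check makes the intent clear.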

eliben · 2017-03-10T04:23:14Z

tests/test_c_parser.py

-        self.assert_coord(f1.ext[0], 2, 'test.c')
-        self.assert_coord(f1.ext[1], 3, 'test.c')
-        self.assert_coord(f1.ext[2], 6, 'test.c')
+        self.assert_coord(f1.ext[0], 2, 13, 'test.c')


It's not very intuitive that this is column 13... How about wrapping parse to strip as much leading whitespace as the first line has, or something like that, to get a more expected offset?

In fact, as the column starts at 1, no spaces are counted in tokens. The 13 is where the a starts:

TypeDecl: a, [] IdentifierType: ['int']

Again, if you don't agree with the lex column start, don't hesitate to tell me and I will modify the behavior.

I used a preprocessed input file of 21945 lines (behemoth.h from https://github.com/snare/ida-efiutils/blob/master/behemoth.h). cProfile timings for c_parser.py:parse:

without patch: 1.774s/1.784s/1.801s
with patch: 1.847s/1.857s/1.891s

So about a 4% slowdown.

As the function is called often in this case (about 31000 times), I suspect a little overhead from cProfile.

Raw values for the whole Python script:
without patch: 1.997s/2.030s/2.093s
with patch: 2.040s/2.053s/2.090s

(Note the max times of both runs: the timing measurements may not be very accurate here.)
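
The percentages can be rechecked from the figures quoted above. This is only an arithmetic sketch: it assumes the smallest figure in each triple is the most representative measurement, and the dictionary layout and the function name `slowdown_percent` are made up for illustration.

```python
# Timings in seconds, copied from the comment above (three figures each).
profiled = {"without": [1.774, 1.784, 1.801], "with": [1.847, 1.857, 1.891]}
raw = {"without": [1.997, 2.030, 2.093], "with": [2.040, 2.053, 2.090]}


def slowdown_percent(timings):
    """Relative slowdown of the patched run, comparing the fastest figures."""
    base = min(timings["without"])
    patched = min(timings["with"])
    return 100.0 * (patched - base) / base


print(f"cProfile: {slowdown_percent(profiled):.1f}%")
print(f"raw:      {slowdown_percent(raw):.1f}%")
```

On these numbers the profiled slowdown comes out at roughly 4.1% and the raw one at roughly 2.2%, which is consistent with the suspicion that cProfile itself adds some overhead.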

I meant all the leading whitespace, not 0- vs. 1-based starting. Without columns it wasn't bothersome, but now it's hard to see why this should be 13 and not some nearby number, because everything is indented. To keep this patch small, I'm OK with fixing this somehow in a follow-up.
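
One possible shape for such a follow-up, sketched under assumptions: the test source is an indented multi-line string, and the standard library's `textwrap.dedent` is used to remove the common indentation before parsing. The helper `column_of` and the sample source are hypothetical; the thread does not say how the follow-up was actually done.

```python
import textwrap


def column_of(text, needle):
    """1-based column of the first occurrence of `needle` in `text`."""
    pos = text.index(needle)
    return pos - text.rfind('\n', 0, pos)


# A test source indented by 8 spaces, as it might appear inside a test method.
indent = " " * 8
indented = "\n" + indent + "int a;\n" + indent + "char c;\n"
dedented = textwrap.dedent(indented)

print(column_of(indented, 'a'))  # 13
print(column_of(dedented, 'a'))  # 5
```

After dedenting, the expected column can be read directly off the C snippet, without counting the indentation of the surrounding Python test.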

serpilliere · 2017-03-10T13:31:35Z

add is None for column
comment for _token_coord

I also modified the regression tests in which the file name was not present, so that they comply with the fixed assert_coord API.

eliben · 2017-03-10T14:07:40Z

Thanks, looks good

eliben reviewed Mar 10, 2017

Add column support in c_parser

239fe23

serpilliere force-pushed the Fix_coord branch from c09fd8c to 239fe23 Compare March 10, 2017 13:28

eliben merged commit 471442f into eliben:master Mar 10, 2017

serpilliere deleted the Fix_coord branch March 10, 2017 14:14
