Discussion:
Read C++ enum in python
Ludo
2009-08-18 23:03:30 UTC
Hello,

I work on a very large project where we have C++ packages and pieces of
Python code.

I've been googling for days, but what I find seems far too
complicated for what I want to do.

My task is to read, in Python, enum definitions provided by the
header file of a C++ package.
Of course I could open the .h file, read the enum and transcribe it by
hand into a .py file, but the package is regularly updated, and so is
the enum.

My question is then simple: do we have
- either a simple way in Python to read the .h file, retrieve the C++
enum and provide access to it in my Python script,
- or a simple tool (in the long term it would be run automatically
when the C++ package is compiled) that generates from the .h file a .py
file containing the Python definition of the enums?

Thank you for any suggestions.
MRAB
2009-08-18 23:50:05 UTC
Speaking personally, I'd parse the .h file using a regular expression
(re module) and generate a .py file. Compilers typically have a way of
letting you run external scripts (e.g. batch files in Windows or, in this
case, a Python script) when an application is compiled.
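
A minimal sketch of the regex approach described above, using only the standard library. The function name `enums_to_python`, the `ENUM_MEMBER = value` output format and the sample header are assumptions made for illustration; they do not come from the original post. It only handles plain enums with integer initializers and does not strip comments or preprocessor directives.

```python
import re

# Matches "enum Name { body }"; the body is captured for further splitting.
ENUM_RE = re.compile(r'\benum\s+(\w+)\s*\{(.*?)\}', re.DOTALL)


def enums_to_python(header_text):
    """Return Python source defining one constant per C++ enumerator."""
    lines = []
    for enum_name, body in ENUM_RE.findall(header_text):
        next_value = 0
        for entry in body.split(','):
            entry = entry.strip()
            if not entry:  # tolerate a trailing comma
                continue
            name, _, value = entry.partition('=')
            if value.strip():
                # base 0 accepts decimal and 0x-prefixed literals
                next_value = int(value, 0)
            lines.append('%s_%s = %d' % (enum_name.upper(),
                                         name.strip().upper(), next_value))
            next_value += 1
    return '\n'.join(lines) + '\n'


# Hypothetical header contents, for demonstration only.
sample_header = '''
enum hello { Zero, One, Five = 5, Six };
enum blah
{
    alpha,
    gamma = 0x10,
};
'''

generated = enums_to_python(sample_header)
print(generated)
```

The returned text could then be written to a .py file from a post-build step, so the Python constants are regenerated whenever the header changes.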
Mark Tolonen
2009-08-19 07:28:15 UTC
"MRAB" <python at mrabarnett.plus.com> wrote in message
Post by MRAB
Speaking personally, I'd parse the .h file using a regular expression
(re module) and generate a .py file. Compilers typically have a way of
letting you run external scripts (eg batch files in Windows or, in this
case, a Python script) when an application is compiled.
This is what the third-party library pyparsing is great for:

--------begin code----------
from pyparsing import *

# sample string with enums and other stuff
sample = '''
stuff before

enum hello {
Zero,
One,
Two,
Three,
Five=5,
Six,
Ten=10
}

in the middle

enum blah
{
alpha,
beta,
gamma = 10 ,
zeta = 50
}

at the end
'''

# syntax we don't want to see in the final parse tree
_lcurl = Suppress('{')
_rcurl = Suppress('}')
_equal = Suppress('=')
_comma = Suppress(',')
_enum = Suppress('enum')

identifier = Word(alphas,alphanums+'_')
integer = Word(nums)

enumValue = Group(identifier('name') + Optional(_equal + integer('value')))
enumList = Group(enumValue + ZeroOrMore(_comma + enumValue))
enum = _enum + identifier('enum') + _lcurl + enumList('list') + _rcurl

# find instances of enums ignoring other syntax
for item, start, stop in enum.scanString(sample):
    id = 0
    for entry in item.list:
        # an explicit "= N" resets the running counter
        if entry.value != '':
            id = int(entry.value)
        print('%s_%s = %d' % (item.enum.upper(), entry.name.upper(), id))
        id += 1
--------------end code------------

Output:
HELLO_ZERO = 0
HELLO_ONE = 1
HELLO_TWO = 2
HELLO_THREE = 3
HELLO_FIVE = 5
HELLO_SIX = 6
HELLO_TEN = 10
BLAH_ALPHA = 0
BLAH_BETA = 1
BLAH_GAMMA = 10
BLAH_ZETA = 50

-Mark
Mark Tolonen
2009-08-20 04:01:23 UTC
"Mark Tolonen" <metolone+gmane at gmail.com> wrote in message
news:h6g9ig$vh0$1 at ger.gmane.org...
[snip]
Paul McGuire (pyparsing author) reminded me that:

enum.ignore(cppStyleComment)

before scanString will skip commented out sections as well.

-Mark
Bill Davy
2009-08-19 09:34:11 UTC
"Mark Tolonen" <metolone+gmane at gmail.com> wrote in message
Python and pythoneers are amazing!
AggieDan04
2009-08-19 06:17:55 UTC
On Aug 18, 6:03 pm, Ludo wrote:
Post by Ludo
- either a simple way in python to read the .h file, retrieve the c++
enum and provide an access to it in my python script
Try something like this:



import re

filename = 'package.h'  # placeholder: path to the C++ header to read

file_data = open(filename).read()
# Remove comments and preprocessor directives
file_data = ' '.join(line.split('//')[0].split('#')[0]
                     for line in file_data.splitlines())
file_data = ' '.join(re.split(r'\/\*.*\*\/', file_data))
# Look for enums: In the first { } block after the keyword "enum"
enums = [text.split('{')[1].split('}')[0]
         for text in re.split(r'\benum\b', file_data)[1:]]

for enum in enums:
    last_value = -1
    for enum_name in enum.split(','):
        if '=' in enum_name:
            enum_name, enum_value = enum_name.split('=')
            enum_value = int(enum_value, 0)
        else:
            enum_value = last_value + 1
        last_value = enum_value
        enum_name = enum_name.strip()
        print('%s = %d' % (enum_name, enum_value))
    print('')
Neil Hodgson
2009-08-19 08:12:48 UTC
Post by AggieDan04
file_data = open(filename).read()
# Remove comments and preprocessor directives
file_data = ' '.join(line.split('//')[0].split('#')[0] for line in
file_data.splitlines())
file_data = ' '.join(re.split(r'\/\*.*\*\/', file_data))
For some headers I tried it didn't work until the .* was changed to a
non-greedy .*? to avoid removing from the start of the first comment to
the end of the last comment.

file_data = ' '.join(re.split(r'\/\*.*?\*\/', file_data))
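
A small illustration of the difference, as a sketch: the sample header string and the variable names `greedy` and `non_greedy` are made up for this example. With the greedy `.*`, everything from the first `/*` to the last `*/` is dropped, including the enum sitting between the two comments; the non-greedy `.*?` removes each comment separately.

```python
import re

# Hypothetical header contents, already joined into one line
# as in the quoted code.
file_data = '/* first */ enum color { RED, GREEN }; /* last */ int x;'

# Greedy: one match spans from the first "/*" to the last "*/".
greedy = ' '.join(re.split(r'\/\*.*\*\/', file_data))

# Non-greedy: each comment is matched and removed on its own.
non_greedy = ' '.join(re.split(r'\/\*.*?\*\/', file_data))

print(repr(greedy))
print(repr(non_greedy))
```

Note that `.` does not match a newline by default, so both patterns rely on the lines having been joined first; on text that still contains newlines, `re.DOTALL` would be needed to catch multi-line comments.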

Neil
Ludo
2009-08-19 15:37:56 UTC
Thank you! I'll adopt it!

Cheers.
Brian
2009-08-20 04:08:46 UTC
pygccxml http://www.language-binding.net/pygccxml/pygccxml.html

It uses gccxml to compile your source code into XML, and then makes all of
your source code available to you via a high-level and convenient query
interface in Python.
