Discussion:
[fonc] Porting CodeGenerator vs VPU
Ryan Davis
2008-07-14 20:45:52 UTC
Permalink
Are there any notable pros / cons to porting CodeGenerator vs VPU to
a different language? I'm just looking to experiment at this stage, so
I don't need the most advanced thing in the world, but if there is a
compelling reason, I'll do the bigger porting job.
John Leuner
2008-07-16 08:47:59 UTC
Permalink
The libvpu.a used by the VPU target is not open source.

The BURG target does register allocation and should produce more
efficient code.

What language are you porting to? I have done a port to Common Lisp.

John
Post by Ryan Davis
Are there any notable pros / cons to porting CodeGenerator vs VPU to
a different language? I'm just looking to experiment at this stage, so
I don't need the most advanced thing in the world, but if there is a
compelling reason, I'll do the bigger porting job.
_______________________________________________
fonc mailing list
http://vpri.org/mailman/listinfo/fonc
Ryan Davis
2008-07-17 21:26:18 UTC
Permalink
Post by John Leuner
The libvpu.a used by the VPU target is not open source.
I'm not sure I care at this stage (just experimenting) but... I don't
see a libvpu.a... am I missing something? Nowhere in any of the source
or makefiles is libvpu or lvpu. Clue me?
Post by John Leuner
The BURG target does register allocation and should produce more
efficient code.
What language are you porting to? I have done a port to Common Lisp.
I'm porting to Ruby.
Ian Piumarta
2008-07-16 20:50:32 UTC
Permalink
Ryan,
Post by Ryan Davis
Are there any notable pros / cons to porting CodeGenerator vs VPU
to a different language?
The VPU model is nice because it requires a minimum amount of state
(just one object) to be maintained in the client. No clumsy data
structures to construct and manipulate. You can write, compile and
run a VPU-based program that adds 3 to 4 to make 7 in less than a
minute. To do the same with the CodeGenerator means that, at the very
minimum, you must construct ASTs and agree with the CodeGenerator on
how they are represented.

The reason the VPU was abandoned in favour of the CodeGenerator is that the
very first thing the VPU does with the 'program' that you feed it (in
the form of function calls, terminating in 'compile()') is to
translate that program into something that looks just like an AST
(see [1] for details). Rather than have Jolt take an AST, linearise
it into a sequence of VPU function calls, then reconstruct an almost
identical AST in the VPU, I figured I'd cut out the step in the
middle and just translate Jolt s-expressions down into (effectively)
the AST that the VPU would have constructed. The internal processes
that convert the low-level AST into binary code in the VPU and the
CodeGenerator are actually very similar in spirit, but somewhat more
elegant in the CodeGenerator.

If you look at

'(RETI4 (ADDI4 (CNSTI4 3) (CNSTI4 4)))

and

vpu->LdI4(3)
->LdI4(4)
->AddI4()
->RetI4();

you can see pretty quickly how one representation can be translated
trivially into the other. I think whichever you decide to
investigate further, you'll rapidly end up solving identical problems.

It has always been my intention to (eventually) create a standalone
CodeGenerator back-end library that provides a VPU-like interface.
This would be a very thin wrapper, just enough to reconstruct the AST
from the sequence of VPU function calls.
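Such a wrapper could be sketched in a few lines of Ruby (illustrative only; the class and method names here are assumptions, not the real VPU or CodeGenerator API):

```ruby
# Minimal sketch of a VPU-style stack interface that reconstructs an
# AST from the sequence of calls instead of generating code directly.
# All names are illustrative, not the actual VPU/CodeGenerator API.

Node = Struct.new(:op, :children)

class VpuLike
  def initialize
    @stack = []                # holds partially built AST nodes
  end

  def ld_i4(n)                 # push a constant node
    @stack.push(Node.new(:CNSTI4, [n]))
    self                       # return self to allow call chaining
  end

  def add_i4                   # pop two operands, push an ADDI4 node
    b, a = @stack.pop, @stack.pop
    @stack.push(Node.new(:ADDI4, [a, b]))
    self
  end

  def ret_i4                   # wrap the top of stack in a RETI4 node
    @stack.push(Node.new(:RETI4, [@stack.pop]))
    self
  end

  def compile                  # hand the finished AST to a back end
    @stack.pop
  end
end

ast = VpuLike.new.ld_i4(3).ld_i4(4).add_i4.ret_i4.compile
# ast now mirrors '(RETI4 (ADDI4 (CNSTI4 3) (CNSTI4 4)))
```

This is exactly the "trivial translation" between the two representations: the client sees a stack, but nothing of that stack survives into the result.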
Post by Ryan Davis
I'm just looking to experiment at this stage, so I don't need the
most advanced thing in the world, but if there is a compelling
reason, I'll do the bigger porting job.
The CodeGenerator is weak in a number of places and is due for a
total rewrite. The handling of automatic variables is poor, and the
code generated for control structures is abysmal. I've been
experimenting with a kind of 'lightweight continuation-passing' style
of code generation that I think I might adopt in Jolt3 (this approach
is particularly good at generating optimal control structures with
very little effort/complexity). Jolt3 will also use a single
framework for top-down parsing of source, intermediate AST-to-AST
analysis/transformation/optimisation, and bottom-up tree rewriting
for concrete instruction selection.

Cheers,
Ian
Ian Piumarta
2008-07-16 20:54:32 UTC
Permalink
Oops...
the very first thing the VPU does with the 'program' that you feed
it is to translate that program into something that looks just like
an AST (see [1] for details).
[1] http://www.usenix.org/events/vm04/tech/piumarta.html

Sorry,
Ian
Igor Stasenko
2008-07-16 21:19:50 UTC
Permalink
Post by Ian Piumarta
Oops...
the very first thing the VPU does with the 'program' that you feed it is
to translate that program into something that looks just like an AST (see
[1] for details).
The VPU is the much better approach for code translation. No dependency
on C. Direct access to low-level operations on memory and the stack,
and, with some additional coding, even to different hardware features.
Post by Ian Piumarta
[1] http://www.usenix.org/events/vm04/tech/piumarta.html
Sorry,
Ian
--
Best regards,
Igor Stasenko AKA sig.
Ryan Davis
2008-07-17 23:48:45 UTC
Permalink
Post by Igor Stasenko
Post by Ian Piumarta
Oops...
the very first thing the VPU does with the 'program' that you feed it is
to translate that program into something that looks just like an AST (see
[1] for details).
The VPU is the much better approach for code translation. No dependency
on C. Direct access to low-level operations on memory and the stack,
and, with some additional coding, even to different hardware features.
I'm torn. That sounds lovely, but I'd also like to experiment with
going towards a register based machine, rather than stack. But
really... the machine code generator side is my weak point, so I
should be doing all of them as many times as I can.
Ian Piumarta
2008-07-18 01:23:18 UTC
Permalink
There seems to be some confusion (probably mostly in my head)...
Post by Ryan Davis
Post by Igor Stasenko
the very first thing the VPU does with the 'program' that you feed it is
to translate that program into something that looks just like an AST (see
[1] for details).
The VPU is the much better approach for code translation. No dependency
on C. Direct access to low-level operations on memory and the stack,
and, with some additional coding, even to different hardware features.
I'm torn. That sounds lovely, but I'd also like to experiment with
going towards a register based machine
The VPU presented the end user with a stack abstraction that compiles
into register-based instructions (as does the CodeGenerator, except
that its stack is implicit in the 'imaginary call graph' of the
s-expressions you feed to it). Nothing whatsoever remains of the VPU's
stack in the final generated code. It did just as much register
allocation as the CodeGenerator does. Neither the VPU nor the
CodeGenerator presents the client with a register-based model, and
both of them target register-register instruction sets. Decent
register allocation is one of the trickiest parts of code generation
(especially for asymmetrical, irregular architectures like the 386)
which is why both the VPU and the CodeGenerator (try their best to :)
take care of it for their client programmers.

The VPU had a generic 'user-defined instruction' that could be
parameterised to generate any desired sequence of concrete
instructions while taking part in register allocation, etc.; part of
the rationale for this approach was that C++ isn't particularly
conducive to extending class hierarchies with virtual functions in
arbitrary ways at runtime. The CodeGenerator is a little easier to
extend (in large part because it isn't written in C++) since new
types can be added under Instruction dynamically at run time and then
embedded into the AST entirely in end-user land, via (syntax ...),
and the internal workings are (IMHO) somewhat more transparent and
sometimes almost verging on 'table driven' in spirit (at least for
the instruction selection and instruction-specific register
allocation rules).
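In a dynamic host language, the kind of run-time extension Ian describes is nearly free. A hypothetical Ruby sketch (none of these names come from the actual CodeGenerator):

```ruby
# Sketch of adding a new instruction type at run time, in the spirit
# of the CodeGenerator's dynamically extensible Instruction hierarchy.
# Class names and the emitted text are illustrative assumptions.

class Instruction
  def emit
    raise NotImplementedError, "subclasses must define emit"
  end
end

# A user can define and register a brand-new instruction at run time,
# without touching the code generator's own source.
swap = Class.new(Instruction) do
  def emit
    "xchg eax, ecx"   # illustrative x86 text, not real back-end output
  end
end

insn = swap.new
insn.emit   # the new instruction participates like any built-in one
```

This is the contrast with C++: there, a generic parameterisable 'user-defined instruction' stands in for the class you cannot conveniently create at runtime.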
Post by Ryan Davis
the machine code generator side is my weak point, so I should be
doing all of them as many times as I can.
It's also the part of a compiler that is traditionally ignored, or at
best treated superficially and rapidly, in compiler textbooks.

Cheers,
Ian
Ryan Davis
2008-07-19 04:07:41 UTC
Permalink
Post by Ian Piumarta
The VPU presented the end user with a stack abstraction that
compiles into register-based instructions [... blah blah it'll just
work blah blah ...]
Ian,

Have I mentioned lately that I love you?

*kisses*,

Ryan
Ryan Davis
2008-07-17 21:31:46 UTC
Permalink
Post by Ian Piumarta
The VPU model is nice because it requires a minimum amount of state
(just one object) to be maintained in the client. No clumsy data
structures to construct and manipulate. You can write, compile and
run a VPU-based program that adds 3 to 4 to make 7 in less than a
minute. To do the same with the CodeGenerator means that, at the very
minimum, you must construct ASTs and agree with the CodeGenerator on
how they are represented.
VPU sounds nice and simple... I'm taking a whack at porting it now and
by using more metaprogramming am getting roughly a 2:1 compression
ratio out of the code. I think other areas (like not having your {}
form vs the regular []) will make that up soon.
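The kind of metaprogramming that yields that compression might look like this (a hypothetical sketch, not Ryan's actual port; the method names just mimic the VPU's typed-opcode naming convention):

```ruby
# Sketch of generating the families of typed VPU-style methods
# (LdI4, LdI8, AddI4, ...) from a table via Ruby metaprogramming,
# instead of writing each one out by hand as the C++ original must.
# All names are illustrative.

class MiniVpu
  def initialize
    @ops = []
  end
  attr_reader :ops

  # Generate one load method per operand width: ld_i4, ld_i8.
  [4, 8].each do |w|
    define_method("ld_i#{w}") { |n| @ops << [:"LDI#{w}", n]; self }
  end

  # Generate the binary arithmetic ops for every width in one loop.
  %w[add sub mul div].product([4, 8]) do |op, w|
    define_method("#{op}_i#{w}") { @ops << [:"#{op.upcase}I#{w}"]; self }
  end
end

v = MiniVpu.new.ld_i4(3).ld_i4(4).add_i4
# v.ops => [[:LDI4, 3], [:LDI4, 4], [:ADDI4]]
```

Ten lines of table-driven definitions replace dozens of near-identical method bodies, which is where a 2:1 ratio (or better) comes from.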
Post by Ian Piumarta
The reason the VPU was abandoned in favour of the CodeGenerator is that the
very first thing the VPU does with the 'program' that you feed it
(in the form of function calls, terminating in 'compile()') is to
translate that program into something that looks just like an AST
(see [1] for details). Rather than have Jolt take an AST, linearise
it into a sequence of VPU function calls, then reconstruct an almost
identical AST in the VPU, I figured I'd cut out the step in the
middle and just translate Jolt s-expressions down into (effectively)
the AST that the VPU would have constructed. The internal processes
that convert the low-level AST into binary code in the VPU and the
CodeGenerator are actually very similar in spirit, but somewhat more
elegant in the CodeGenerator.
Actually, I find the AST approach ideal: I already have a lot of tools
for dealing with ASTs (I wrote ParseTree and SexpProcessor for Ruby)
and projects to merge with (I work on rubinius, which is
AST-to-bytecode currently).
Post by Ian Piumarta
If you look at
'(RETI4 (ADDI4 (CNSTI4 3) (CNSTI4 4)))
and
vpu->LdI4(3)
->LdI4(4)
->AddI4()
->RetI4();
Yeah... but the AST approach means you can quickly and easily slip
another layer in there without having to touch any other code. I think
I want to go with a billion thin layers to allow for easier
experimentation.
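A "thin layer" in this style is just a small tree-to-tree function. An illustrative Ruby sketch (the node shape and opcode names are assumptions):

```ruby
# Sketch of one thin AST-to-AST layer: constant folding.  A pass like
# this can be slipped into the pipeline without touching the layers on
# either side of it.  Node shape ([:OP, child, ...]) is illustrative.

FOLDABLE = { ADDI4: :+, SUBI4: :-, MULI4: :* }

def fold(ast)
  return ast unless ast.is_a?(Array)       # leaves pass through
  op, *args = ast
  args = args.map { |a| fold(a) }          # fold children first
  if (rb_op = FOLDABLE[op]) &&
     args.all? { |a| a.is_a?(Array) && a[0] == :CNSTI4 }
    # Both operands are constants: replace the node with the result.
    [:CNSTI4, args[0][1].send(rb_op, args[1][1])]
  else
    [op, *args]
  end
end

fold [:RETI4, [:ADDI4, [:CNSTI4, 3], [:CNSTI4, 4]]]
# => [:RETI4, [:CNSTI4, 7]]
```

Each layer takes a tree and returns a tree, so stacking "a billion" of them is just function composition.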
Post by Ian Piumarta
Post by Ryan Davis
I'm just looking to experiment at this stage, so I don't need the
most advanced thing in the world, but if there is a compelling
reason, I'll do the bigger porting job.
The CodeGenerator is weak in a number of places and is due for a
total rewrite. The handling of automatic variables is poor, and the
code generated for control structures is abysmal. I've been
experimenting with a kind of 'lightweight continuation-passing'
style of code generation that I think I might adopt in Jolt3 (this
approach is particularly good at generating optimal control
structures with very little effort/complexity). Jolt3 will also use
a single framework for top-down parsing of source, intermediate AST-
to-AST analysis/transformation/optimisation, and bottom-up tree
rewriting for concrete instruction selection.
I assume the VPU has similar problems in its generated code?
Michael FIG
2008-07-23 05:33:38 UTC
Permalink
Hi,
Jolt3 will also use a single framework for top-down parsing of
source, intermediate AST-to-AST
analysis/transformation/optimisation, and bottom-up tree rewriting
for concrete instruction selection.
I think this is one of the more exciting things happening in jolt3!

After staring at function/jolt3, I think I finally am grasping some of
the design you are working on. Here are some notes that I've made to
help other people understand, as well as to raise some questions I
have:

* "%" signs introduce end-of-line comments

* The first line of a grammar is an object declaration. Since
productions become methods on an object, grammars can inherit from
other grammars and thereby extend or override the parent's
productions.

* Grammars that don't inherit from other grammars should inherit from
Parser, which provides the infrastructure used by the generated .st
file.

* The "start" production/method is called by Parser>>parse:, so it's
the default production.

* "<SomeProduction>" matches a production against the current token
stream, and saves its results back into the front of the input
stream for further rules to process. So, you can build up a
pipeline with stuff like "<compile <optimise <parse>>>"

* "#(...)" in a rule matches a sequenceable collection (i.e. a Jolt
expression or a vector, or whatever)

* "#ident" matches a symbol

* The value of a production follows "->". "{ ... }" is an idst block
to evaluate. "`(...)", "(...)", or "VARNAME" are jolt values.

* Elements of a production can be captured in a variable by writing
":VARNAME" after them. If you want to capture a string of the
match instead of its value, write "$:VARNAME".

* QUESTION: "!" and "&" are predicates. For "&", if the element it
precedes evaluates to true, it matches, otherwise its clause fails
to match. "!" is the reverse: if the result is true it fails,
otherwise succeeds. (Is this accurate??)

* GrammarParser.g reduces a grammar to an AST

* g2st converts a .g file into an idst .st source file. It does its
work with PepsiGrammarGenerator.g, which inherits from GrammarParser
and walks the AST to print idst source code

Do you foresee any major changes to the grammar syntax? Once you have
it nailed down I'm eager to continue working on my C preprocessor
(I've been taking a break from it to enjoy the summer, as well as stop
investing in the jolt2 cul-de-sac).

function/jolt3/model.c also looks pretty appetizing. For those of you
on the list unfamiliar with it (I had the good fortune of witnessing
the whiteboard explanations), it's the beginnings of a new libid that
uses alists instead of vectors for vtables. This allows even more
sharing of implementation and insertion of methods in interesting
places. It also reduces the number of bootstrap objects.

BTW, I have been doing my homework and now have some promising ideas
for extensions to the object model (namely parameterisable garbage
collection compatible with thread-safe coroutines) that will probably
make the core simpler and more flexible. I think I'll wait to do
those experiments until after it's possible to write libid in Coke
syntax.

Ian, if there is anything that could be done to help expedite the work
you're doing, would you be able to describe that to the mailing list?

Hope all is well with you,
--
Michael FIG <***@fig.org> //\
http://michael.fig.org/ \//
Kjell Godo
2008-08-23 09:22:53 UTC
Permalink
AST = Abstract Syntax Tree?
Post by Michael FIG
Hi,
Jolt3 will also use a single framework for top-down parsing of
source, intermediate AST-to-AST
analysis/transformation/optimisation, and bottom-up tree rewriting
for concrete instruction selection.
[...]
Joshua Gargus
2008-08-23 16:17:52 UTC
Permalink
Post by Kjell Godo
AST = Abstract Syntax Tree?
Yes.
