Discussion:
[fonc] ( picoVerse-:( picoLARC assembler and FoNC ) )
Kjell Godo
2008-10-06 16:03:21 UTC
picoLARC gets a lot of ideas from FoNC.

I am trying to make a small assembler of sorts in Dolphin Smalltalk.
I am using the Intel IA-32 manual to do it, and the manual is ambiguous a lot
of the time.
I have got the addressing simulator (ModR/M and SIB) coded up.
Next are the expression compilers and the byte code generator bits.
I would like to do a clean, phat, documented job that might serve as a live
replacement for the IA-32 manuals.
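
A minimal sketch of how the ModR/M and SIB bytes are packed, written in Python rather than Smalltalk for brevity. The helper names (modrm, sib, mov_r32_mem_sib_disp8) are hypothetical and only illustrate the field layout given in the IA-32 manual; special cases such as ESP as an index or EBP as a base with mod=00 are not handled.

```python
# Register numbers as used in the ModR/M and SIB fields (IA-32).
EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI = range(8)

def modrm(mod, reg, rm):
    """Pack the ModR/M byte: mod (2 bits), reg (3 bits), r/m (3 bits)."""
    assert 0 <= mod < 4 and 0 <= reg < 8 and 0 <= rm < 8
    return (mod << 6) | (reg << 3) | rm

def sib(scale, index, base):
    """Pack the SIB byte; scale is the multiplier 1, 2, 4 or 8."""
    ss = {1: 0, 2: 1, 4: 2, 8: 3}[scale]
    assert 0 <= index < 8 and 0 <= base < 8
    return (ss << 6) | (index << 3) | base

def mov_r32_mem_sib_disp8(dst, base, index, scale, disp8):
    """Hypothetical helper: encode MOV r32, [base + index*scale + disp8]."""
    return bytes([
        0x8B,                     # opcode: MOV r32, r/m32
        modrm(0b01, dst, 0b100),  # mod=01 (disp8 follows), r/m=100 (SIB follows)
        sib(scale, index, base),
        disp8 & 0xFF,
    ])

# mov eax, [ebx + esi*4 + 8]
code = mov_r32_mem_sib_disp8(EAX, EBX, ESI, 4, 8)
print(code.hex(" "))  # 8b 44 b3 08
```

Typing the same instruction into a single-instruction assembler is one way to cross-check the bytes.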

The main idea is to generate machine code into a ByteArray, like Udo did
in his US_Inline_Assembler packages, stick that into an ExternalMethod,
and run it from there. It has been suggested that callbacks into Dolphin
could be made for access to windowing etc. Using the picoLARC assembler,
perhaps a VM could be made in a large ByteArray, with its own data section,
garbage collectors etc., that could run a version of picoLARC more quickly
than in Dolphin.

I probably need some kind of help from somewhere. I'm looking at
Bluishcoder's blog, which seems good to me, but I don't know JavaScript.
I was thinking of using CodeX or Masm32 or Nasm to help me get the byte
code generator right. There is also the OllyDbg debugger or something.

Does anyone know of any other good Windows assembler debuggers?

Does Windows have a dlopen()/dlsym()?

Is there a place that shows how Windows calls are made at the assembler
level?
I have seen Masm32 but I don't notice any source code available.

I am gradually trying to get to where I can compile fonc on Windows. But I'm
no C programmer. I can read it, sort of.
I downloaded Cygwin and the development packages but it is huge. The Ocean
project seemed to do it using ming.
Should I uninstall Cygwin and use mingw32 instead?
Trying to learn how to use all the stuff in Cygwin seems daunting.
Is there an IDE that you use? Or do you just use command-line compilers like
gcc etc.?
Do any of the Microsoft compilers generate machine code anymore?

Does fonc have a machine code generator in it, or does it generate assembler
which is then compiled into machine code by some external back end? I am
interested in the parts where ModR/M and SIB and the opCodes hit the road.

It would be cool to be able to generate a DLL file from inside of Dolphin
Smalltalk and then link to it and run it.
On Mon, Nov 19, 2007 at 12:20 PM, Ian Piumarta wrote:
Why isn't Cola implemented in Win32 also?
There is some support for native Win32. Last time I checked, the 'idc'
compiler, 'jolt-burg' and the canvas stuff compiled and ran on Win32 (using
mingw32, with no dependencies on Cygwin, Interix, etc...).
I am interested in how executable code is generated in
general. Can you point me at any books or info about
how such things are done?
I don't know of any books on dynamic code generation specifically, but you
could try citeseer (or just plain google) to look for conference/workshop
papers on dynamic code generation. Many of the techniques used in static
compilation are applicable to dynamic compilation. The differences are
(IMO) carefully selecting optimisations that have a high performance/price
ratio and designing the runtime with late binding in mind.
By late binding you mean leaving holes in the executable code where links to
external libraries can be put?
COLA does its dynamic linking explicitly, using dlopen()/dlsym() to find
the addresses of functions and variables at run time.
Thank-you for describing picoLARC. The goals seem very similar to ours.
I'll take a look at the SF.net project.
Cheers,
Ian
Denver Gingerich
2008-10-06 16:18:46 UTC
On Mon, Oct 6, 2008 at 12:03 PM, Kjell Godo wrote:
[...]
Post by Kjell Godo
I probably need some kind of help from somewhere. I'm looking at
Bluishcoder's
blog which seems good to me but I don't know JavaScript. I was thinking
of using CodeX or Masm32 or Nasm to help me to get the byte code
generator right. There is also the OllyDbg debugger or something.
Does anyone know of any other good Windows assembler debuggers?
Does Windows have a dlopen()/dlsym()?
Is there a place that shows how Windows calls are made at the assembler
level?
I have seen Masm32 but I don't notice any source code available.
For disassembling and debugging, I really like IDA Pro Freeware. The
feature that really sets it apart from other free disassemblers I've
seen is its call graph feature. With call graphs, you can easily see
how the control flow of the assembly works, which is invaluable in
understanding how a given binary works. It also shows the name of the
Windows system call next to the assembly code where it is called.

To get IDA Pro Freeware, use the Download link from the main IDA Pro website:

http://www.hex-rays.com/idapro/

Denver
Kjell Godo
2008-11-29 21:57:01 UTC
Post by Denver Gingerich
For disassembling and debugging, I really like IDA Pro Freeware. The
feature that really sets it apart from other free disassemblers I've
seen is its call graph feature. With call graphs, you can easily see
how the control flow of the assembly works, which is invaluable in
understanding how a given binary works. It also shows the name of the
Windows system call next to the assembly code where it is called.
In OllyDbg you can easily assemble a single instruction and see the
result right away in the open and running debugger, including the hex
bytes of the assembled instruction. You don't have to create any text
files or learn any compilation commands. You just open a .exe file,
right-click to bring up the single-instruction assembler, type in the
assembler text, hit the button, and out come the hex bytes. Can IDA Pro
do something like this? I already know how to make OllyDbg do this much,
so whatever IDA Pro can do would need to be this easy to be worth it.
It would be interesting to see if IDA Pro outputs the same hex codes as
OllyDbg in the many cases where several machine codes do the same thing.
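
As one concrete case of several machine codes doing the same thing, here is a small Python sketch (helper name hypothetical) of the two standard register-to-register encodings of mov eax, ebx. Which form any given assembler or debugger picks is not claimed here; the sketch only shows that both byte sequences are valid.

```python
EAX, ECX, EDX, EBX = 0, 1, 2, 3  # IA-32 register numbers

def modrm_reg_reg(reg, rm):
    """ModR/M byte with mod=11 (register-direct): bits are 11 reg rm."""
    return 0xC0 | (reg << 3) | rm

# Opcode 89 /r is MOV r/m32, r32: destination in r/m, source in reg.
form_a = bytes([0x89, modrm_reg_reg(reg=EBX, rm=EAX)])
# Opcode 8B /r is MOV r32, r/m32: destination in reg, source in r/m.
form_b = bytes([0x8B, modrm_reg_reg(reg=EAX, rm=EBX)])

print(form_a.hex(" "))  # 89 d8
print(form_b.hex(" "))  # 8b c3
```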
Gerardo Richarte
2008-10-06 17:21:41 UTC
Post by Kjell Godo
picoLARC gets a lot of ideas from FoNC.
wow, amazing project.
Post by Kjell Godo
Does anyone know of any other good Windows assembler debuggers?
OllyDbg is the choice of the hobbyists (http://www.ollydbg.de/)
Version 2.0 is still unstable.
IDA Pro is the choice of the pros (http://www.hex-rays.com/idapro/)
I know these two lines will generate reactions :)
Post by Kjell Godo
Does Windows have a dlopen()/dlsym()?
Yes, kind of.

LoadLibraryA() and GetModuleHandleA() are dlopen(). The latter if
you know the library is already in memory, the former regardless of
whether it's loaded or not (i.e. use LoadLibraryA()).

GetProcAddress() is dlsym().

To use, in assembly:

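push dllName          ; pointer to the NUL-terminated DLL name
call LoadLibraryA     ; eax = module handle

push functionName     ; second argument, pushed first: pointer to the symbol name
push eax              ; first argument: module handle returned from LoadLibraryA
call GetProcAddress
; eax now contains the pointer to the looked up symbol.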

The calling convention for most libraries in Windows is STDCALL:
arguments are pushed as in C (last argument first, so the first argument
ends up closest to the return address);
the stack is cleaned by the callee (functions use RET n to return).

For some weird cases the stack pointer has to be aligned to 4 bytes.
This should not normally be a problem, but in the weird case
that it is, it's not easy to debug.
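
The two halves of the STDCALL contract can be sketched as byte sequences. This is a minimal Python illustration with hypothetical helper names, assuming every argument is a 4-byte immediate; it is not tied to any particular Windows function.

```python
import struct

def push_imm32(value):
    """PUSH imm32: opcode 68 followed by a little-endian dword."""
    return b"\x68" + struct.pack("<I", value & 0xFFFFFFFF)

def ret_n(arg_count):
    """RET imm16 (opcode C2): return and pop arg_count dwords of arguments."""
    return b"\xC2" + struct.pack("<H", 4 * arg_count)

def caller_pushes(args):
    """Caller side: push arguments last-to-first, so the first argument
    ends up closest to the return address that CALL pushes next."""
    return b"".join(push_imm32(a) for a in reversed(args))

pushes = caller_pushes([0x11111111, 0x22222222])
print(pushes.hex(" "))    # 68 22 22 22 22 68 11 11 11 11
print(ret_n(2).hex(" "))  # c2 08 00
```

Because the callee executes RET 8 here, the caller does not adjust ESP after the call.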
Post by Kjell Godo
Is there a place that shows how Windows calls are made at the
assembler level?
Well... I can help you a lot here. Also, disassembling (with IDA) any
C program you compile should be a good source. However, if you do this,
the DLLs will be linked at compile time, and not resolved using
LoadLibrary()/GetProcAddress().
If you have specific questions send them to the list. I'll try to be
responsive.
Post by Kjell Godo
Is there an IDE that you use? Or do you just use commandline
compilers like gcc etc.
Do any of the Microsoft compilers generate machine code anymore?
I don't use DevC++ (http://www.bloodshed.net/devcpp.html) but I know
of people who use it. If your C code does not make calls to the OS (as I
think is the case for FONC), you should not need Cygwin, and DevC++ should
be enough for your needs.
Post by Kjell Godo
Does fonc have a machine code generator in it or does it generate
assembler which is
then compiled into machine code by some external back end? I am
interested in the
parts where ModR/M and SIB and the opCodes hit the road.
I'm not really into the FONC project, so take my words here as a rumor.
As far as I know, it does not directly spit out machine code and it relies on
an external assembler... please, somebody correct me! (or confirm)
Post by Kjell Godo
It would be cool to be able to generate a DLL file from inside of
Dolphin Smalltalk and
then link to it and run it.
Yes it will :)


If you get closer to implementing this, I could also give you a hand
here.

richie
Kjell Godo
2008-10-08 06:40:26 UTC
Thank you so much Gerardo

I am rusty at assembly so I will try here to repeat what you said back and
see if my interpretation is right.
Post by Gerardo Richarte
push dllName
push a 4 byte pointer which is the address of the first byte of the DLLName
Post by Gerardo Richarte
call LoadLibraryA
* * * * * * * * * * * * *
How does the assembler get the pointer to the LoadLibraryA function?
If you could tell me that or point me at where it is written it would make
my day. make that my year.
Post by Gerardo Richarte
push functionName
push eax ; module handle, returned from LoadLibraryA
call GetProcAddress
; eax contains the pointer to the looked up symbol.
So what you are doing is loading or finding a DLL and then getting
the address of a function within that DLL. And it seems like all
Windows functions return their return value in EAX. Is this true
for C in general?
Post by Gerardo Richarte
arguments pushed as C (first closer to the return address)
stack is cleaned by callee (functions use RET n to return).
So once I get the pointer to the looked up symbol then I can
call that pointer by doing (in pseudo assembler):

push lastInput
...
push firstInput
call lookedUpPointer

where each input is a 4 byte pointer in the example above.

So it looks like once I get LoadLibrary and GetProcAddress
calls going then I can get all other calls to all other DLL functions
going at least dynamically at runtime. But how I get the address
of LoadLibrary and GetProcAddress themselves I don't yet know.

Anyhow: Wow that's a great reply for me ! It really tells me a lot.

So in the ByteArray I am dynamically assembling into,
I would reserve some 4 byte slots for the addresses
of each of the Windows functions I want to call. Initially I would
look them up and stick the addresses into those slots, and after that
use ModR/M SIB addressing to call those functions from anywhere
inside of the ByteArray using the call opCode.

So I allocate a ByteArray in Smalltalk and load machine code into
it that has reserved slots for all the needed function addresses.
I stick the ByteArray into an ExternalMethod and call it. Initialization
code in the ByteArray looks up all the function addresses and
sticks them into the slots. Now somewhere in the ByteArray there
might be a call instruction which uses ModR/M addressing
to get the function address from the right slot and
so make the call. After the inputs were pushed onto the stack
in reverse order, that is.
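
The slot-table idea can be sketched in Python with a bytearray standing in for the Smalltalk ByteArray. Everything named here is an assumption for illustration: the fixed load address BASE, the slot names, and the made-up function address. The sketch also assumes the buffer's real address is known and does not move, and it leaves open how the first lookup functions are found.

```python
import struct

def push_imm32(value):
    """PUSH imm32: opcode 68 followed by a little-endian dword."""
    return b"\x68" + struct.pack("<I", value & 0xFFFFFFFF)

def call_indirect_abs(slot_address):
    """CALL r/m32 (opcode FF /2) with ModR/M 0x15: mod=00, reg=010,
    r/m=101, i.e. call the address stored at an absolute 32-bit location."""
    return b"\xFF\x15" + struct.pack("<I", slot_address)

BASE = 0x00400000  # assumed (hypothetical) address of the buffer in memory
SLOTS = ["LoadLibraryA", "GetProcAddress", "ExitProcess"]  # example names

buf = bytearray(4 * len(SLOTS))  # slot table at the start of the buffer
slot_addr = {name: BASE + 4 * i for i, name in enumerate(SLOTS)}

def patch_slot(name, function_address):
    """Store a looked-up function address into its reserved 4-byte slot."""
    struct.pack_into("<I", buf, slot_addr[name] - BASE, function_address)

# Code after the slot table: push one argument, then call through a slot.
buf += push_imm32(0) + call_indirect_abs(slot_addr["ExitProcess"])

patch_slot("ExitProcess", 0x7C81CAFA)  # made-up address, illustration only
print(buf[12:].hex(" "))  # 68 00 00 00 00 ff 15 08 00 40 00
```

The call instruction never changes after assembly; only the 4-byte slot it reads from is patched once the address is known.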
Cedric Roux
2008-10-08 07:05:26 UTC
Post by Kjell Godo
So what you are doing is loading or finding a DLL and then getting
the address of a function within that DLL. And it seems like all
Windows functions return their return value in EAX. Is this true
for C in general?
Post by Gerardo Richarte
arguments pushed as C (first closer to the return address)
stack is cleaned by callee (functions use RET n to return).
For the calling conventions and how values are returned, see:
http://www.sco.com/developers/devspecs/abi386-4.pdf
which is specific to Intel i386. There are other ABIs on the website too.
I think Windows follows the ABI.

Cedric.
Kjell Godo
2008-10-09 04:47:23 UTC
Wow, thank you so much for that info.
_______________________________________________
fonc mailing list
http://vpri.org/mailman/listinfo/fonc
John Leuner
2008-10-06 20:52:09 UTC
Post by Kjell Godo
Does fonc have a machine code generator in it or does it generate
assembler which is
then compiled into machine code by some external back end? I am
interested in the
parts where ModR/M and SIB and the opCodes hit the road.
Do an SVN checkout of

http://piumarta.com/svn2/idst/trunk

and look at the .st files in the jolt-burg directory. These implement
x86 and other code generators.

I have also ported this code generator to Common Lisp; have a look at
the files here:

http://subvert-the-dominant-paradigm.net/repos/hgwebdir.cgi/bootstrap/file/b87b87521259/burg/

John
Kjell Godo
2008-10-08 05:31:53 UTC
Thank you so much for this pointer. I don't know how to do an SVN checkout
but I will try to figure it out.

Kjell Godo
2008-10-13 04:08:22 UTC
Post by John Leuner
Post by Kjell Godo
Does fonc have a machine code generator in it or does it generate
assembler which is
then compiled into machine code by some external back end? I am
interested in the
parts where ModR/M and SIB and the opCodes hit the road.
Do an SVN checkout of
http://piumarta.com/svn2/idst/trunk
and look at the .st files in the jolt-burg directory. This implements an
x86 and other code generators.
I looked at the jolt-burg directory.
There seems to be a static code generator and a dynamic code generator.
Does the dynamic code generator spit out binary?
The code generator that I was looking at seemed to be outputting ASCII
assembler source code that I assume would then be sent to some assembler
like gcc or something.

What is a ReductiveGrammar ?

I can't say that I was able to make heads or tails of much of anything
in there though. Nothing is commented. I wish there could be some
long comments in there that would explain what was going on. It all
looks very high tech. I would like to understand it all but I can't.
Yet.

How about if the code was in HTML or XML and the comments were put
into popup windows that you could click on? That way you could put
a lot of explanation in there without gumming up the code with a lot of
comments.

The style seems to be jillions and jillions of one line methods.

The C stuff is very hard to read. I just kind of stare at it like a
hedgehog in the headlights.
Post by John Leuner
I have also ported this code generator to common lisp, have a look at
http://subvert-the-dominant-paradigm.net/repos/hgwebdir.cgi/bootstrap/file/b87b87521259/burg/
Okay, I will try to look at this too.
Can I run it in any free Windows based Lisps?
I would like a little guided tour test suite that would go from
easy examples to hard ones that show how the code generator
works. In a stepping source code debugger like in Smalltalk.
But Lisp doesn't seem to have that. And then in each method
along the way I would like a little explanation of what it is doing.
Along with inspectors to see what the data looks like.

In other words I would like something just like the picoLARC
project on sourceforge.net . But I don't think I'm probably going
to get it. Oh well, that's okay.

How do I make heads or tails of this stuff?
Kjell Godo
2008-10-13 04:23:18 UTC
Could you please explain:

(:reg (:early-cleanup :reg) ,#'(lambda (op cg)
                                 (movl-reg cg (output (lhs op)) :reg (eax cg))
                                 (emit-epilogue cg)))

:reg (:early-cleanup :reg) ?
What is the op and the cg = codeGenerator?
movl-reg ?
(output (leftHandSide op))?
Kjell Godo
2008-10-13 04:41:35 UTC
What is ccrs standing for?

(add (ccrs c) (setf (ecx c) (make-instance 'register :register-class
:I4 :name :ecx :encoding #x41)) )

Why is the encoding hex 41? Why isn't it just 1?
John Leuner
2008-10-17 11:40:32 UTC
Post by Kjell Godo
(:reg (:early-cleanup :reg) ,#'(lambda (op cg)
(movl-reg cg (output (lhs op)) :reg (eax cg))
(emit-epilogue cg)))
?
This is a pseudo-instruction that takes the return value as an argument
and places it in the eax register. It then emits the cleanup code to
return from the function.
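
What such a rule's action does can be sketched in Python, emitting AT&T-style assembler text into a list. The function names mirror the Lisp ones, but the epilogue shown (leave, ret) is only an assumed conventional one; the real epilogue depends on the code generator's frame layout.

```python
def movl_reg(out, src, dst):
    """Emit a 32-bit register-to-register move (AT&T operand order)."""
    out.append(f"movl %{src}, %{dst}")

def emit_epilogue(out):
    """Assumed conventional epilogue: tear down the frame and return."""
    out.extend(["leave", "ret"])

def early_cleanup(out, value_reg):
    """Move the return value into eax, then emit the function epilogue."""
    movl_reg(out, value_reg, "eax")
    emit_epilogue(out)
    return out

print(early_cleanup([], "ecx"))  # ['movl %ecx, %eax', 'leave', 'ret']
```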
Post by Kjell Godo
What is the op and the cg = codeGenerator?
op is the early-cleanup instruction. cg is the code generator object.
Post by Kjell Godo
movl-reg ?
moves a long from a register to a register
Post by Kjell Godo
(output (leftHandSide op))?
returns the register allocated for the left-hand side of the
early-cleanup instruction
Post by Kjell Godo
What is ccrs standing for?
call-clobbered registers

These are registers whose values are not preserved after a CALL/RET. The
call-saved registers (ebx, esi, edi) must be saved by a function before
it uses them, and the values restored before returning.
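
A minimal Python sketch of the save/restore obligation for call-saved registers, using the one-byte PUSH r32 (50+r) and POP r32 (58+r) encodings. The function name save_restore and the idea of passing a set of used registers are assumptions for illustration, not how the Lisp code generator is known to work.

```python
EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI = range(8)

CALL_CLOBBERED = (EAX, ECX, EDX)  # a callee may overwrite these freely
CALL_SAVED = (EBX, ESI, EDI)      # a callee must preserve these

def save_restore(used):
    """Return (prologue, epilogue) bytes that save and restore exactly
    the call-saved registers a function uses."""
    saved = [r for r in CALL_SAVED if r in used]
    prologue = bytes(0x50 + r for r in saved)            # PUSH r32
    epilogue = bytes(0x58 + r for r in reversed(saved))  # POP r32, reverse order
    return prologue, epilogue

prologue, epilogue = save_restore({EAX, EBX, EDI})
print(prologue.hex(" "))  # 53 57
print(epilogue.hex(" "))  # 5f 5b
```

EAX is ignored because it is call-clobbered; only EBX and EDI get saved.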
Post by Kjell Godo
(add (ccrs c) (setf (ecx c) (make-instance 'register :register-class
:I4 :name :ecx :encoding #x41)) )
Why is the encoding hex 41? Why isn't it just 1?
If you look at the initialization code:

(defmethod initialize-instance :after ((c i32-code-generator) &key)
  (add (ccrs c) (setf (eax c) (make-instance
    'register :register-class :I4 :name :eax :encoding #x40)))
  (add (ccrs c) (setf (ecx c) (make-instance
    'register :register-class :I4 :name :ecx :encoding #x41)))
  (add (ccrs c) (setf (edx c) (make-instance
    'register :register-class :I4 :name :edx :encoding #x42)))
  (add (csrs c) (setf (ebx c) (make-instance
    'register :register-class :I4 :name :ebx :encoding #x43)))
  (add (csrs c) (setf (esi c) (make-instance
    'register :register-class :I4 :name :esi :encoding #x46)))
  (add (csrs c) (setf (edi c) (make-instance
    'register :register-class :I4 :name :edi :encoding #x47)))
  (setf (esp c) (make-instance
    'register :register-class :P4 :name :esp :encoding #x44))
  (setf (ebp c) (make-instance
    'register :register-class :P4 :name :ebp :encoding #x45))
  (setf (cx c) (make-instance
    'register :register-class :I2 :name :cx :encoding #x21))
  (setf (cl c) (make-instance
    'register :register-class :II :name :cl :encoding #x11))
  c)

You can see that 32-bit registers are marked with the #x40 bit, 16-bit
registers with #x20, and 8-bit registers with #x10.
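
A short Python sketch of how such an encoding might be taken apart. It assumes (the message does not say this) that the low three bits are the register number used in IA-32 ModR/M bytes, which does match the values listed: eax=0, ecx=1, edx=2, ebx=3, esp=4, ebp=5, esi=6, edi=7. The names SIZE_FLAGS and split_encoding are made up for illustration and do not come from the Lisp sources.

```python
# Size marker bits as described above; the table name is illustrative only.
SIZE_FLAGS = {0x40: 32, 0x20: 16, 0x10: 8}

def split_encoding(encoding):
    """Return (width_in_bits, register_number) for an encoding such as 0x41."""
    number = encoding & 0x07   # assumed: low 3 bits = IA-32 register number
    flag = encoding & 0x70     # the size marker bits
    return SIZE_FLAGS[flag], number

for name, enc in [("eax", 0x40), ("ecx", 0x41), ("cx", 0x21), ("cl", 0x11)]:
    width, number = split_encoding(enc)
    print(f"{name}: {width}-bit, register number {number}")
```

Under that reading, ecx, cx and cl all share register number 1 and differ only in the size marker, which would explain why the encoding is #x41 rather than plain 1.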

John
Kjell Godo
2008-11-30 07:04:45 UTC
It seems like idc is generating a jolt-burg file which
contains a compiler which has a ReductiveGrammar
part that matches pseudo code trees and outputs
ASCII assembler which is compiled by a gcc assembler,
and that binary machine code winds up in memory
if dynamic compilation is happening. Are you directing
the gcc assembler to do flat assembly into a file and
then reading that file into RAM?

There seem to be lots of compilers generating other
compilers in there. I would like to understand this
sort of thing.

Are there any books or papers that tell about what you
are doing? ReductiveGrammars, etc.

What exactly is your Common Lisp project about?
Is it a port of fonc (Pepsi, Cola, etc.) to Lisp? Or
is it something different?

If I want to try to understand the fonc compilers or your
Lisp versions, is there anyplace to start?



On Fri, Oct 17, 2008 at 3:40 AM, John Leuner wrote:
Post by John Leuner
Post by Kjell Godo
(:reg (:early-cleanup :reg) ,#'(lambda (op cg)
(movl-reg cg (output (lhs op)) :reg (eax cg))
(emit-epilogue cg)))
?
This is a pseudo-instruction that takes the return value as an argument
and places it in the eax register. It then emits the cleanup code to
return from the function.
So when the rule generated by the above expression sees an
:early-cleanup op in the tree then it emits
movl <rrr>, %eax
<epilogue>
where <rrr> is some register name. And a long is 32 bits.
<rrr> was allocated to receive the result of an instruction's
evaluation. And that result is to be returned in eax.
I would like to know how registers are allocated.
Post by John Leuner
Post by Kjell Godo
What is the op and the cg = codeGenerator?
op is the early-cleanup instruction. cg is the code generator object.
I don't understand: (:reg (:early-cleanup :reg)
It's some kind of pattern matching but I don't get it.
Is (:reg some kind of function call that creates a rule?
Post by John Leuner
Post by Kjell Godo
movl-reg ?
moves a long from a register to a register
Post by Kjell Godo
(output (leftHandSide op))?
returns the register allocated for the left-hand side of the
early-cleanup instruction
Post by Kjell Godo
What is ccrs standing for?
call-clobbered registers
These are registers whose values are not preserved after a CALL/RET. The
call-saved registers (ebx, esi, edi) must be saved by a function before
it uses them, and the values restored before returning.
So csrs stands for call-saved registers. Are the call-saved registers saved
only by your program, or must all programs save them? If so, why?
Post by John Leuner
Post by Kjell Godo
(add (ccrs c) (setf (ecx c) (make-instance 'register :register-class
:I4 :name :ecx :encoding #x41)) )
Why is the encoding hex 41? Why isn't it just 1?
(defmethod initialize-instance :after ((c i32-code-generator) &key)
(add (ccrs c) (setf (eax c) (make-instance
'register :register-class :I4 :name :eax :encoding #x40)))
So it looks like you are making a register, storing it in the eax
slot of c, and also adding it to a set or list of call-clobbered registers
that is also in c. Call-clobbered registers are registers that can get clobbered
by calls. :I4 is 4 bytes long, :I2 is 2 bytes long, and :I1 is 1 byte long.
:P4 is a 4-byte pointer. This :after method is evaluated after the primary
initialize-instance method.
Post by John Leuner
(add (ccrs c) (setf (ecx c) (make-instance
'register :register-class :I4 :name :ecx :encoding #x41)) )
(add (ccrs c) (setf (edx c) (make-instance
'register :register-class :I4 :name :edx :encoding #x42)) )
(add (csrs c) (setf (ebx c) (make-instance
'register :register-class :I4 :name :ebx :encoding #x43)) )
(add (csrs c) (setf (esi c) (make-instance
'register :register-class :I4 :name :esi :encoding #x46)) )
(add (csrs c) (setf (edi c) (make-instance
'register :register-class :I4 :name :edi :encoding #x47)) )
(setf (esp c) (make-instance
'register :register-class :P4 :name :esp :encoding #x44))
(setf (ebp c) (make-instance
'register :register-class :P4 :name :ebp :encoding #x45))
(setf (cx c) (make-instance
'register :register-class :I2 :name :cx :encoding #x21))
(setf (cl c) (make-instance
'register :register-class :II :name :cl :encoding #x11))
c)
You can see that 32-bit registers are marked with the #x40 bit, 16-bit
with #x20, and 8-bit with #x10
John
John Leuner
2008-12-01 07:43:48 UTC
Post by Kjell Godo
It seems like idc is generating a jolt-burg file which
contains a compiler which has a ReductiveGrammar
No, idc is a compiler for a Smalltalk dialect that outputs C code.

jolt-burg is a program written in this Smalltalk dialect that can
compile a sexp-based language into pseudo-instruction trees. These trees
are processed by a reduction grammar (which uses a pattern matcher) to
produce machine code.

The machine code can be emitted as ascii assembler (for static
compilation) or it can be emitted as binary code directly to memory (see
asm-i386.h).
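
To make the two output paths concrete, here is a small Python sketch that emits one register-to-register move both ways. It is only an illustration, not code from jolt-burg or asm-i386.h; the function names movl_text and movl_bytes are hypothetical. The byte encoding is the standard IA-32 one (opcode 0x89, MOV r/m32, r32, followed by a ModR/M byte).

```python
# IA-32 register numbers as used in the ModR/M byte.
REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
       "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def movl_text(src, dst):
    """Static path: one line of AT&T-syntax assembler for an external assembler."""
    return f"\tmovl\t%{src}, %{dst}\n"

def movl_bytes(src, dst):
    """Dynamic path: the same instruction as raw machine-code bytes."""
    # mod=11 (both operands are registers), reg field = source, r/m field = destination
    modrm = 0xC0 | (REG[src] << 3) | REG[dst]
    return bytes([0x89, modrm])

print(movl_text("ebx", "eax"), end="")
print(movl_bytes("ebx", "eax").hex())
```

The text form still has to go through an assembler and linker, while the byte form can be written straight into an executable buffer at run time.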
Post by Kjell Godo
part that matches pseudo code trees and outputs
ascii assembler which is compiled by a gcc assembler
and that binary machine code winds up in memory
if dynamic compilation is happening. Are you directing
the gcc assembler to do flat assembly into a file and
then reading that file into RAM memory?
There seem to be lots of compilers generating other
compilers in there. I would like to understand this
sort of thing.
It's probably not as complex as you think.
Post by Kjell Godo
Are there any books or papers that tell about what you
are doing? ReductiveGrammars, etc.
There aren't really books or papers about that, but it's really just a
simple pattern matcher.
Post by Kjell Godo
What exactly is your Common Lisp project about?
Is it a port of fonc Pepsi Cola etc to Lisp? Or
is it something different.
My project started with a port of jolt-burg (and OMeta) to Common Lisp.

I have used these components to build two new programming languages
called Church and State.

http://subvert-the-dominant-paradigm.net/blog/?p=28
Post by Kjell Godo
If I want to try to understand the fonc compilers or your
Lisp versions, is there anyplace to start?
For fonc, it's difficult to say where to start; it depends on what you are
interested in. Fonc is still under development.

My projects are also under development and I don't plan to write a
code-level introduction or tutorial until after I have completed the
bootstrap process.
Post by Kjell Godo
On Fri, Oct 17, 2008 at 3:40 AM, John Leuner wrote:
Post by Kjell Godo
(:reg (:early-cleanup :reg) ,#'(lambda (op cg)
    (movl-reg cg (output (lhs op)) :reg (eax cg))
    (emit-epilogue cg)))
?
This is a pseudo-instruction that takes the return value as an argument
and places it in the eax register. It then emits the cleanup code to
return from the function.
So when the rule generated by the above expression sees an
:early-cleanup op in the tree then it emits
movl <rrr>, %eax
<epilogue>
where <rrr> is some register name. And a long is 32 bits.
<rrr> was allocated to receive the result of an instruction's
evaluation. And that result is to be returned in eax.
I would like to know how registers are allocated.
The register allocation is handled in these files:

http://subvert-the-dominant-paradigm.net/repos/hgwebdir.cgi/bootstrap/file/93e1996ad6a5/burg/resource.lisp
http://subvert-the-dominant-paradigm.net/repos/hgwebdir.cgi/bootstrap/file/93e1996ad6a5/burg/instruction.lisp
http://subvert-the-dominant-paradigm.net/repos/hgwebdir.cgi/bootstrap/file/93e1996ad6a5/burg/code-generator.lisp
Post by Kjell Godo
Post by Kjell Godo
What is the op and the cg = codeGenerator?
op is the early-cleanup instruction. cg is the code generator object.
I don't understand: (:reg (:early-cleanup :reg)
It's some kind of pattern matching but I don't get it.
Is (:reg some kind of function call that creates a rule?
I will use this example:

(:reg (:addi4 :reg :reg) ,#'(lambda (op cg)
(addl-reg cg (output (rhs op))
:reg (output op))))

The first :reg indicates that this rule produces an output in a
register. :void is used to indicate no output.

(:addi4 :reg :reg) will match against an instruction that has ":addi4"
as a name. In instruction.lisp you can see that each instruction has a
name.

(defclass ADDI4 (binary) ())    (defmethod name ((i ADDI4)) :addi4)
(defclass ADDRFP4 (leaf) ())    (defmethod name ((i ADDRFP4)) :addrfp4)
(defclass ADDRGP4 (leaf) ())    (defmethod name ((i ADDRGP4)) :addrgp4)
(defclass ADDRJP4 (leaf) ())    (defmethod name ((i ADDRJP4)) :addrjp4)
(defclass ADDRLP4 (leaf) ())    (defmethod name ((i ADDRLP4)) :addrlp4)
(defclass ANDI4 (binary) ())    (defmethod name ((i ANDI4)) :andi4)
(defclass ASGNI1 (binary) ())   (defmethod name ((i ASGNI1)) :asgni1)

The next part of the pattern is :reg :reg, which indicates that the lhs
and rhs of this instruction must produce a result in a register. The
matching and reduction code is in instruction.lisp.
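
The matching described above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the code in instruction.lisp: it tries rules in order instead of choosing by cost, the register allocator just hands out registers from a list, and the constant leaf named "cnsti4" as well as every class and function name here are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    name: str                       # instruction name, e.g. "addi4"
    lhs: Optional["Node"] = None
    rhs: Optional["Node"] = None
    value: Optional[int] = None     # used by constant leaves only
    output: Optional[str] = None    # register assigned during reduction

class CodeGen:
    def __init__(self):
        self.free = ["eax", "ecx", "edx"]   # toy allocator: registers in fixed order
        self.lines = []

    def alloc(self):
        return self.free.pop(0)

    def emit(self, text):
        self.lines.append(text)

def cnst(op, cg):
    op.output = cg.alloc()
    cg.emit(f"movl ${op.value}, %{op.output}")

def add(op, cg):
    op.output = op.lhs.output       # assumed: the result reuses the lhs register
    cg.emit(f"addl %{op.rhs.output}, %{op.output}")

# Each rule: (result kind, (instruction name, kinds of the children...), action)
RULES = [
    ("reg", ("cnsti4",), cnst),
    ("reg", ("addi4", "reg", "reg"), add),
]

def reduce_tree(op, cg):
    """Reduce the children first, then apply the first rule whose pattern matches."""
    children = [c for c in (op.lhs, op.rhs) if c is not None]
    kinds = tuple(reduce_tree(c, cg) for c in children)
    for result, pattern, action in RULES:
        if pattern == (op.name,) + kinds:
            action(op, cg)
            return result
    raise ValueError(f"no rule matches {op.name} {kinds}")

tree = Node("addi4", Node("cnsti4", value=2), Node("cnsti4", value=3))
cg = CodeGen()
reduce_tree(tree, cg)
print("\n".join(cg.lines))
```

The rule tuple mirrors the shape of (:reg (:addi4 :reg :reg) action): the first element says where the result ends up, the pattern names the instruction and what each operand must have produced, and the action emits the code.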
Post by Kjell Godo
Post by Kjell Godo
What is ccrs standing for?
call-clobbered registers
These are registers whose values are not preserved after a CALL/RET. The
call-saved registers (ebx, esi, edi) must be saved by a function before
it uses them, and the values restored before returning.
call saved registers is csrs then. Are the call saved registers just saved
by your program or must they be saved by all programs? If so then why?
This is a standard convention for x86 code. There are different
conventions for different processors and operating systems.

http://en.wikipedia.org/wiki/Calling_convention
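
As a rough illustration of what the convention means for a code generator, the Python sketch below wraps a function body with saves and restores for whichever call-saved registers the body uses. The register sets are the ones from the initialization code quoted earlier in this thread; the function name wrap_body is hypothetical, and frame-pointer setup is left out to keep the example short.

```python
CALL_CLOBBERED = {"eax", "ecx", "edx"}   # ccrs: a callee may overwrite these freely
CALL_SAVED = ["ebx", "esi", "edi"]       # csrs: a callee must hand these back unchanged

def wrap_body(body_lines, used_registers):
    """Add pushes and pops for the call-saved registers that the body uses."""
    to_save = [r for r in CALL_SAVED if r in used_registers]
    prologue = [f"pushl %{r}" for r in to_save]
    # Restore in reverse order because the stack is last-in, first-out.
    epilogue = [f"popl %{r}" for r in reversed(to_save)] + ["ret"]
    return prologue + body_lines + epilogue

body = ["movl %eax, %ebx", "addl %ecx, %ebx", "movl %ebx, %eax"]
print("\n".join(wrap_body(body, {"eax", "ebx", "ecx"})))
```

Because every function follows the same rule, a caller can keep a value in ebx, esi or edi across a call without knowing anything about the function it calls. Code that ignores the rule will corrupt its callers, which is why all programs that interoperate on a platform follow it.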
Post by Kjell Godo
Post by Kjell Godo
(add (ccrs c) (setf (ecx c) (make-instance 'register :register-class
:I4 :name :ecx :encoding #x41)) )
Why is the encoding hex 41? Why isn't it just 1?
(defmethod initialize-instance :after ((c i32-code-generator) &key)
(add (ccrs c) (setf (eax c) (make-instance
'register :register-class :I4 :name :eax :encoding #x40)))
So it looks like you are making a register, storing it in the eax
slot of c, and also adding it to a set or list of call-clobbered registers
that is also in c. Call-clobbered registers are registers that can get clobbered
by calls. :I4 is 4 bytes long, :I2 is 2 bytes long, and :I1 is 1 byte long.
:P4 is a 4-byte pointer. This :after method is evaluated after the primary
initialize-instance method.
Yes, this code describes the registers available on the x86
architecture.

John

John Leuner
2008-10-17 11:23:08 UTC
Post by Kjell Godo
I looked at the jolt-burg directory
There seems to be a static code generator and a dynamic code generator.
Does the dynamic code generator spit out binary?
Yes, it generates machine code directly into memory at run time.
Post by Kjell Godo
The code generator that I was looking at seemed to be outputting ascii
assembler source code that I assume would then be sent to some assembler
like gcc or something.
Yes
Post by Kjell Godo
What is a ReductiveGrammar ?
It's a set of rules for matching over a tree of pseudo-instructions and
generating machine instructions from them.
Post by Kjell Godo
I can't say that I was able to make heads or tails of much of anything
in there though. Nothing is commented. I wish there could be some
long comments in there that would explain what was going on. It all
looks very high tech. I would like to understand it all but I can't
yet.
How about if the code was in HTML or XML and the comments were put
into popup windows that you could click on? That way you could put
long explanations in there without gumming up the code with a lot of
comments.
The style seems to be jillions and jillions of one-line methods.
The C stuff is very hard to read. I just kind of stare at it like a
hedgehog in the headlights.
That is generated code, it is generated by the idc compiler.
Post by Kjell Godo
Post by John Leuner
I have also ported this code generator to common lisp, have a look at
http://subvert-the-dominant-paradigm.net/repos/hgwebdir.cgi/bootstrap/file/b87b87521259/burg/
Okay, I will try to look at this too.
Can I run it in any free Windows based Lisps?
No, at the moment I only run it on SBCL on Linux.
Post by Kjell Godo
I would like a little guided tour test suite that would go from
easy examples to hard ones that show how the code generator
works. In a stepping source code debugger like in Smalltalk.
But Lisp doesn't seem to have that. And then in each method
along the way I would like a little explanation of what it is doing.
Along with inspectors to see what the data looks like.
Maybe one day there will be guided tours of these projects, but right
now it is too early for that and they are still changing too quickly.
Post by Kjell Godo
In other words I would like something just like the picoLARC
project on sourceforge.net . But I don't think I'm probably going
to get it. Oh well, that's okay.
How do I make heads or tails of this stuff?
Perhaps you should start with a small part and try to understand that?

John