Which one is the best for your project? The answer may well be several different ones, and possibly even not the most current version of any particular one either!
The 1987 vintage TOUCH took up only 3992 bytes. Given what TOUCH
does, and the almost 4K size, it's pretty safe to assume that the
Turbo C TOUCH utility wasn't written in assembler code because an
assembler version of TOUCH would only be a few hundred bytes. So,
TOUCH is probably a tiny model C program.
The Turbo C version 1.5 and version 2.0 TOUCHes look to be
identical to the Turbo C 1.0 version for the most part, and their
sizes were identical.
Enter Turbo C++ version 1.0 though. Now we see a big change in the
little TOUCH utility! The Turbo C++ 1.0 TOUCH has grown by about 1K.
Do you suppose there could be 1K worth of new function or bug fixes
in this TOUCH? Not likely. A more plausible scenario is that someone
updated the version of the compiler TOUCH was being built with and
didn't pay attention to the code growth when it happened.
The Turbo C++ 1.0 version of TOUCH has also managed to blow the 2K
and 4K cluster boundaries that the previous version was sliding under
by 104 bytes! Hmmmm....
Now along comes Borland C++ version 2.0, and we see that TOUCH has
managed to grow some again. Not too bad this time though, only 6
bytes more for the 1991 vintage TOUCH. The TOUCH that comes with the
Borland C++ version 3.0 compiler is the same size as this one too.
Along comes Borland C++ version 3.1 and we see a different, and
bigger, TOUCH again. This one has grown by just over 400 bytes. Could
there possibly be 400 bytes worth of new function and/or bug fixes in
this version of TOUCH? Again, not likely.
The TOUCH that comes with the OS/2 Borland C++ version 1.5 compiler is
a real shocker. It makes the DOS/Windows compiler versions all look
anemic by comparison.
An even worse shocker is the TOUCH that comes with the Watcom version
10.0 C/C++ compiler -- 44,033 bytes. To be fair to the Watcom folks,
their WTOUCH does do a lot more than the Borland DOS or OS/2 versions.
It's also a "bound" 16 bit OS/2 1.X compatible executable
(which means the same executable can run under OS/2 protected mode or
DOS). The penalty for having this dual-mode capability in a program
normally runs between 12K and 20K depending on how much of the OS/2
dual mode library is being dragged in.
So, what the heck was going on with this DOS version of the Borland
TOUCH utility that caused it to grow from its original 1987
incarnation at 3992 bytes to the 5528 byte 1992 edition? It's
rather unlikely that anyone at Borland spent a lot of effort doing
major code modifications to the DOS version of TOUCH. After all,
TOUCH isn't the kind of utility that requires a lot of maintenance
work once it's up and running. It's also unlikely that there's
enough operational code in TOUCH's source for code generation
differences between the compilers to account for the size
difference.
The difference has to be in runtime library and C startup code
differences between the different versions of the compiler. Later
versions of the Borland compiler are clearly carrying around more
baggage than the earlier versions were.
When we're looking to shrink down some program, this knowledge is
useful. Blindly using the latest version of some compiler may well
hurt rather than help us. If some utility or component of a system
is working OK with an older revision of a compiler, and that version
is producing smaller executables, then maybe it pays to keep the
older version of the compiler around for building those components
that will benefit from it.
There are also some testing impacts to consider here. If by
using an older version of a compiler, a particular component of a
system stays the same from release to release, the test folks will
have a lot warmer and fuzzier feeling about it. Barring bugs
and/or behavior changes in new operating systems, that particular
component is always going to behave the same for you.
Balancing off the testing advantage are whatever problems are
involved in keeping different levels of compilers around in whatever
build system you've got. These aren't insurmountable problems, and
they're a one-time cost too. Once you get the multiple compiler
environment set up it's not a big deal.
By the way, I've got a whole raft of C/C++ compilers from several
vendors on hand, and I often find myself going back to the 1988
vintage Turbo C 2.0 compiler for doing DOS things because it
produces the smallest EXEs in many cases.
If you've got code that can be compiled by compilers from several
different vendors, or with different versions of a compiler,
try 'em all out and see which one works best for particular
components of whatever you're working on.
Is newer better? (maybe not in every case)
Let me give you a concrete real life example of what I mean here.
I'll use the evolution of the Turbo/Borland C/C++ compilers as a
demonstration. Every version of Turbo/Borland C/C++ since Turbo
C 1.0 has included a little utility called TOUCH. I'm sure you're
all familiar with what TOUCH does - i.e. not much! It just updates
the date and time on a file to be whatever the current date/time on
the system happens to be.
Compiler                  Size in bytes
Turbo C 1.0/1.5/2.0        3992
Turbo C++ 1.0              5118
Borland C++ 2.0/3.0        5124
Borland C++ 3.1            5528
Borland C++ 1.5 (OS/2)    15872
Watcom C++ 10.0           44033
/*----------------------------------------------------------
   Sample function to demonstrate the effects of inlining
   of string functions.
----------------------------------------------------------*/
#include <string.h>
char s1[10], s2[10];
void foo(int fast)
{
   if (fast)
   {
      /* #pragma turns on inlining for strcpy() */
      #pragma intrinsic strcpy
      strcpy(s1,s2);
   }
   else
   {
      /* #pragma turns off inlining for strcpy() */
      #pragma intrinsic -strcpy
      strcpy(s2,s1);
   }
}
I compiled this test function with the Borland C++ version 3.1
compiler using the -c (compile only) and -S (generate an ASM file)
options. Here's the relevant code that was generated. I've added
some comments next to the generated code that contain the byte
counts for the code produced for the two different calls to
strcpy().
_TEXT segment byte public 'CODE'
;
;   void foo(int fast)
;
assume cs:_TEXT
_foo proc near
push bp
mov bp,sp
push si
push di
;
;   {
;      if (fast)
;
cmp word ptr [bp+4],0
je short @1@86
;
;         {
;   #pragma intrinsic strcpy
;         strcpy(s1,s2);
;
mov si,offset DGROUP:_s1      (3 bytes)
mov di,offset DGROUP:_s2      (3 bytes)
push ds                       (1 byte )
pop es                        (1 byte )
xor ax,ax                     (2 bytes)
mov cx,-1                     (3 bytes)
repnz scasb                   (2 bytes)
not cx                        (2 bytes)
sub di,cx                     (2 bytes)
shr cx,1                      (2 bytes)
xchg si,di                    (2 bytes)
mov ax,ds                     (2 bytes)
mov bx,ax                     (2 bytes)
mov ax,es                     (2 bytes)
mov ds,ax                     (2 bytes)
mov es,bx                     (2 bytes)
rep movsw                     (2 bytes)
adc cx,cx                     (2 bytes)
rep movsb                     (2 bytes)
;                       Total (39 bytes)
;         }
;
jmp short @1@114
@1@86:
;
;      else
;         {
;   #pragma intrinsic -strcpy
;         strcpy(s2,s1);
;
mov ax,offset DGROUP:_s1      (3 bytes)
push ax                       (1 byte )
mov ax,offset DGROUP:_s2      (3 bytes)
push ax                       (1 byte )
call near ptr _strcpy         (3 bytes)
pop cx                        (1 byte )
pop cx                        (1 byte )
;                       Total (13 bytes)
@1@114:
;
;         }
;   }
;
pop di
pop si
pop bp
ret
_foo endp
_TEXT ends
Wow! There was a 26 byte difference between inlining strcpy() and
not inlining it. If your code had 100 calls to strcpy() scattered
around in various places and they were all inlined, it would cost
an extra 2.6K in code size. 2.6K is a lot of code. It might be the
difference between being able to keep something as "small" or
"tiny" model or being forced into "medium" model with multiple
code segments.
So, don't just blindly throw a global switch on your compiler to
inline things unless they really do need to be inlined for speed.
Doing this kind of stuff globally can really pork out the code.
The profiler may even tell you that the hot spots in the code
aren't CPU bound at all, rather they're I/O bound or dependent on
some operating system supplied system call. If the profiling tool
reveals 90% of the program's time is spent in the C compiler's
fwrite() routine or in some system call that draws arcs in an OS/2
or Windows window, then inlining things isn't going to buy you much
at all because your code isn't the speed bottleneck.
When the compiler knows the length of the source string, it might
generate smaller code than by calling the C runtime library to do
the job.
In the previous example the compiler couldn't know the length of
the source string at compile time because it was a variable.
Suppose though it's a string constant? Let's check out how the
Borland version 3.1 C++ compiler behaves in a case like this one:
In this case, the length of the source string can be determined at
compile time because the source is a constant value. This allowed
the compiler to generate a lot better code for the intrinsic
version than in the previous example. Here's the code that got
generated for this experiment:
In this case both versions turned out to be the same at 13 bytes.
Clearly, the way to go in a situation like this one is to inline
the strcpy(). Whenever you have identical sized code sequences
where one is faster than the other, go for the speed because
it's free.
Suppose we're generating code for an 80186 or better CPU here and
can tell the compiler to use the added instructions those CPUs
implement? In that case, the compiler generated this code for
the call to the runtime library:
Interestingly, here the compiler was smart enough to use the PUSH
immediate instruction, but chose to use a 1 byte larger ADD SP,4
to clean the parameters off the stack rather than the two
POP CX instructions that it used when generating 8086 code.
Is the portability trade off worth it? That depends on the
application and its intended use. If you're writing a TSR or DOS
device driver, then it's a fair assumption that the code isn't
likely to be ported to a mainframe or some machine with a CPU
that isn't 80x86 compatible.
One approach to the portability problem is to write and debug the
code using fairly portable techniques and then #ifdef sections
for a specific compiler. Within the 80x86 compiler world, most
vendors implement a set of runtime library functions called int86()
and int86x() for low level OS and BIOS interfacing. Among the
compilers that do implement these functions, their behavior is
usually quite similar -- often being identical. For all practical
purposes, these two functions are "standard" among 16 bit
C/C++ compilers for the PC. They have no analog outside the PC
world of course, but within it we can usually count on them being
present.
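Since int86x() got mentioned but never shows up in the DETECT
example, here's a minimal sketch of what a call that involves the
segment registers looks like. This one isn't part of the DETECT
code (the GETDTA name is just an illustration); it asks DOS
(Int 21h, function 2Fh) for the current Disk Transfer Area
address, which comes back in ES:BX, so the segment registers have
to travel through a struct SREGS. It should compile with most of
the 16 bit PC compilers mentioned above, but treat it as an
illustration rather than tested code.
/*------------------------------------------
   GETDTA.C
   Illustration only: use int86x() to ask
   DOS for the current Disk Transfer Area
   address (Int 21h, AH = 2Fh, which returns
   the address in ES:BX).
------------------------------------------*/
#include <stdio.h>
#include <dos.h>
int main(void)
{
   union REGS regs;
   struct SREGS sregs;
   segread(&sregs);     /* start with our own segment registers */
   regs.h.ah = 0x2F;    /* DOS get-DTA-address function          */
   int86x(0x21, &regs, &regs, &sregs);
   printf("DTA is at %04X:%04X\n", sregs.es, regs.x.bx);
   return 0;
}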
Compiling the DETECT code using small model with the old Borland
Turbo C 2.0 compiler yielded an EXE that was 4232 bytes in size.
However, the Borland compiler has some nice extensions that can
make the code smaller and faster than using the REGS union and
the int86() runtime library call. Turbo C has "pseudo
register" variables that represent the CPU's registers and
the ability to generate an interrupt call inline. Recoding
DETECT to take advantage of those implementation specific
features would look like this:
This version of DETECT uses both the register variables and the
inline generation of a software interrupt call when it's being
compiled by one of the Borland compilers (the Borland compilers
all predefine the symbol __TURBOC__).
Some nice things happened when special casing the Borland compiler
for this program. The EXE that resulted dropped from 4232 bytes
down to 3992 bytes -- a 240 byte decrease. Notice that this
happens to be enough of a reduction that disk clusters will
be saved.
Running the DSPACE utility (from chapter 1) on the original
4232 byte EXE gave this output:
All we needed to save a 4K, 2K, 1K, and 512 byte cluster on DETECT
was 136 bytes. Special casing the Borland compiler got 240 bytes,
so in this case making that small change was a real winner when
DETECT was compiled with Turbo C 2.0 -- and the code is still going
to compile and run OK on all those other various compilers too
because the change was #ifdef'd into the code.
Obviously the majority of the 240 byte savings from special casing
of Turbo C comes from not having the int86() function linked into
the program anymore. However, some of it comes from an improvement
in the generated code as well.
Using the standard int86() method would require the compiler to
generate an assignment to a variable in memory, pass several
parameters to int86(), and then generate a comparison with a
memory variable to check the result.
The special case version will generate a much simpler set of
instructions. The assignment statement "_AX = 0x4300"
translates directly into a "MOV AX,4300h" instruction.
The "geninterrupt(0x2F)" statement translates into
an "INT 2Fh" instruction, and
the "if (_AL == 0x80)" test translates to
a "CMP AL,80h" instruction.
The older Microsoft C/C++ compilers don't implement pseudo
register variables or the ability to inline a software
interrupt call. However, they do implement the ability to
do inline assembler. Extending the DETECT program to special
case the Microsoft product as well as the Borland would look
something like this:
Current versions of the Microsoft compiler predefine
the "_MSC_VER" macro, so I've keyed the special case
code off that symbol. Before the special casing for the
Microsoft compiler the EXE that the version 8.0 compiler
produced was 5939 bytes in size. After the special casing the
size of the EXE had dropped to 5795 -- a 144 byte reduction.
Note that we had to introduce an intermediate variable here
because the Microsoft compiler doesn't implement the
pseudo-registers the way the Borland compiler does.
The Watcom C/C++ compiler's approach to inline assembler is quite
a bit different from the Microsoft compiler's. Watcom has you
define a quasi-function called a "code burst".
The code burst is then expanded inline wherever it
is "called" in the code.
The advantage of Watcom's code burst scheme is that you get to
define the behavior of the inline code with respect to the
registers it may destroy, and the register(s) it passes back
return values in.
This allows the optimizer to know a lot more about how the
section of inline code behaves. This scheme should allow the
optimizer to produce better code. The Microsoft and Borland
compilers will disable certain optimizations in functions
containing inline assembler code because they don't have any
knowledge about the side effects from that code.
Using all the default options and compiling with small model,
the Watcom 10.0 compiler produced a 5756 byte executable for
the DETECT program. Now here's a version of DETECT that
incorporated some conditionalized code to special case the
Watcom compiler as well as the Borland and Microsoft compilers:
By the way, the Watcom compiler predefines the
macro "__WATCOMC__", so that's a convenient way to
detect the Watcom compiler. Compiling this new version of DETECT
that special cases the Watcom compiler produced an executable that
was only 4620 bytes. That's a savings of 1136 bytes!
If we were writing a TSR or device driver for DOS that had an
int86() call embedded in it somewhere, then special casing the
Watcom, Borland, and Microsoft compilers could pay handsome
dividends in terms of resident code size for the driver or TSR.
Obviously anyone writing and distributing a 3rd party library
would want to look at special casing the code for these
compilers too. The generic int86() way of doing things is
indeed more standard, but it comes at a price.
Attention to small details like this can give a library vendor
a competitive edge in the market. It can also give your
applications an edge against competitors who were too lazy to
sweat these details, or too unaware of the latent power in their
tools to do so.
For example, suppose we're calling some DOS Int 21 function that
returns with carry set when an error occurred. Using _FLAGS to
handle a condition like this is trivial:
Note that bit #0 in the flags register is the carry flag, so the
mask 0x0001 is masking off the value of the carry flag in the test.
The Borland compilers are smart enough to notice tests like this
one and special case the code they generate for them. The code
generated for an "if" statement, like in this example,
is going to resolve to a JC or JNC instruction -- you couldn't do
any better in a situation like this by writing it in pure assembler
code. If you're doing low level interfacing work with the Borland
compilers, this is a good one to have in your bag of tricks.
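The same trick works for the zero flag, which lives in bit 6 of
the flags register, so a companion macro along the same lines
(ZERO_SET is just an illustrative name, not something from the
example above) might look like this:
#define ZERO_SET (_FLAGS & 0x0040)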
Here we've got the function bar() calling foo() and passing the
address of an "automatic" variable as a parameter. In a
case like this, the variable "x" will be allocated on
the stack. If these functions are compiled using large model, the
pointer being passed to foo() will be "far". The
Borland 4.52 compiler generated this large model code for the two
functions:
Notice the "les bx,dword ptr [bp+6]" instruction that
was generated in the foo() function. This is typical of
a "far" pointer reference in large model code. This
instruction is going to result in a segment register load which
is an expensive operation in protected mode code -- like under
Windows or a DOS extender. It's also sloshing a double word
around in memory, which hurts on CPUs with a 16 bit data path
like the 386SX chips. For the call to foo() the double word
pointer also caused the SS register to be pushed.
If we could be assured that the foo() function would only be
called with pointers to variables that live on the stack then
we could change the declaration of the foo() function to look
like this:
This declares the "p" pointer to be a 16 bit near
pointer that will always have an SS: override applied to any
reference that uses it. Since the parameter passed to foo()
in this example does indeed live on the stack, the call to
foo() in the bar() function only needs to pass the offset of
the "x" variable. The code generated when the _ss
pointer change was made looks like this:
Notice how the costly loading of the segment register is gone in
the foo() function now. The code for the call to foo() is also
a lot simpler and faster now as well.
_ds, _cs, and _es pointers behave in a similar manner to the _ss
pointer we just examined. The only difference is in the type of
segment override the compiler is going to apply whenever the
pointer is referenced. These special pointer types can be a
powerful fine tuning tool for programs being built with the
Borland compilers. Having the ability to suppress the reloading
of a segment register in protected mode goes a long way towards
minimizing the usual speed penalty (about 25% versus the same
code running in real mode) associated with protected mode large
model programs.
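Just to show the declaration forms, here's a little sketch using
the same syntax as the _ss example above (the variable names are
only placeholders). Each of these is a 16 bit offset-only pointer;
the only difference is the segment override the compiler applies
when the pointer is dereferenced:
#ifdef __TURBOC__
int _ss *sp;   /* references get an SS: override                  */
int _ds *dp;   /* references get a DS: override (the normal       */
               /* near data default)                              */
int _es *ep;   /* references get an ES: override -- ES has to be  */
               /* pointing at the right segment before it's used  */
int _cs *cp;   /* references get a CS: override, handy for data   */
               /* that lives in the code segment                  */
#endif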
In a plain DOS program using an _seg pointer is an easy way
to access a program's PSP (program segment prefix). Here's
an example of an _seg pointer being used. For output, this
program simply echoes whatever parameters it was passed when it
was run.
One of the convenient properties of _seg pointers is they can be
combined with a "near" pointer when doing pointer
arithmetic. The result of adding a near pointer to an _seg
pointer is that a "far" pointer is generated. This is
being done in the call to putchar() in the example program.
Use "the force" Luke!
In this case, "the force" is a profiler or performance
analysis tool. Run the code under a profiler in what you expect
to be common user scenarios and let the tool tell you where the
performance hot spots are in the code. In those sections you may
want to pay the price for inlining. What the profiler is probably
going to tell you is that there's a fairly limited number of
routines in the code where all the hot action is occurring.
Everywhere else just won't matter in terms of speed.
Sometimes inlining is smaller as well as faster!
I just got done telling you why you might not want to inline
strcpy() calls in many cases. Well, there's a fairly common
exception worth noting here.
/*----------------------------------------------------------
Sample function to demonstrate the effects of inlining
of string functions.
----------------------------------------------------------*/
#include <string.h>
char s1[10];
void foo(void)
{
#pragma intrinsic strcpy
strcpy(s1," "); /* this one is inlined */
#pragma intrinsic -strcpy
strcpy(s1," "); /* this one calls the runtime library */
}
_foo proc near
push bp
mov bp,sp
push si
push di
;
; {
; #pragma intrinsic strcpy
; strcpy(s1," ");
;
mov di,offset DGROUP:_s1 (3 bytes )
mov si,offset DGROUP:s@ (3 bytes )
push ds (1 byte )
pop es (1 byte )
mov cx,1 (3 bytes )
rep movsw (2 bytes )
Total (13 bytes)
;
; #pragma intrinsic -strcpy
; strcpy(s1," ");
;
mov ax,offset DGROUP:s@+2 (3 bytes )
push ax (1 byte )
mov ax,offset DGROUP:_s1 (3 bytes )
push ax (1 byte )
call near ptr _strcpy (3 bytes )
pop cx (1 byte )
pop cx (1 byte )
; Total (13 bytes)
; }
;
pop di
pop si
pop bp
ret
_foo endp
;
; #pragma intrinsic -strcpy
; strcpy(s1," ");
;
push offset DGROUP:s@+2 (3 bytes )
push offset DGROUP:_s1 (3 bytes )
call near ptr _strcpy (3 bytes )
add sp,4 (3 bytes )
Total (12 bytes)
Compiler specific extensions
Many C/C++ compilers have implementation specific extensions that
can be valuable for speeding up and shrinking code. Naturally
using any of these makes the code less portable than using more
standard features.
Inline assembler code
Suppose we had a C program that detects the presence of an XMS
driver (like HIMEM.SYS). A "standard" approach to
accomplishing this using a 16 bit C/C++ compiler for the PC might
go like this:
/*------------------------------------------
DETECT.C
Detect an XMS driver in a "standard" way
(for the PC world at least).
This source code will compile and run OK
using many different C/C++ compilers for
the PC. I tried it with the Lattice 3.0
compiler, Turbo C 2.0, Borland C++ 3.1,
Borland C++ 4.52, Microsoft C++ 8.0, and
Watcom C++ 10.0. All work fine with the
same source code.
------------------------------------------*/
#include <stdio.h>
#include <dos.h>
int main()
{
union REGS regs;
int rval;
regs.x.ax = 0x4300;
int86(0x2F, &regs, &regs);
if (regs.h.al == (char)0x80)
{
puts("XMS driver is present");
rval = 0;
}
else
{
puts("No XMS driver present");
rval = -1;
}
return rval;
}
#include <stdio.h>
#include <dos.h>
int main(void)
{
union REGS regs;
int rval;
#ifdef __TURBOC__
_AX = 0x4300;
geninterrupt(0x2F);
if (_AL == (char)0x80)
#else
regs.x.ax = 0x4300;
int86(0x2F, &regs, &regs);
if (regs.h.al == (char)0x80)
#endif
{
puts("XMS driver is present");
rval = 0;
}
else
{
puts("No XMS driver present");
rval = -1;
}
return rval;
}
File [detect.exe] is 4232 bytes long
Cutting 136 bytes saves a 4K cluster
Cutting 136 bytes saves a 2K cluster
Cutting 136 bytes saves a 1K cluster
Cutting 136 bytes saves a 512 byte cluster
#include <stdio.h>
#include <dos.h>
int main(void)
{
union REGS regs;
int rval;
#ifdef _MSC_VER
char ALreturn;
#endif
#ifdef __TURBOC__
_AX = 0x4300;
geninterrupt(0x2F);
if (_AL == (char)0x80)
#else
#ifdef _MSC_VER
__asm mov ax,4300h
__asm int 2Fh
__asm mov ALreturn,al
if (ALreturn == (char)0x80)
#else
regs.x.ax = 0x4300;
int86(0x2F, &regs, &regs);
if (regs.h.al == (char)0x80)
#endif
#endif
{
puts("XMS driver is present");
rval = 0;
}
else
{
puts("No XMS driver present");
rval = -1;
}
return rval;
}
#include <stdio.h>
#include <dos.h>
#ifdef __WATCOMC__
extern char XMSpresent(void);
#pragma aux XMSpresent = \
"mov ax,4300h" \
"int 2Fh" \
value [al] \
modify [ax];
#endif
int main(void)
{
union REGS regs;
int rval;
#ifdef _MSC_VER
char ALreturn;
#endif
#ifdef __TURBOC__
_AX = 0x4300;
geninterrupt(0x2F);
if (_AL == (char)0x80)
#else
#ifdef _MSC_VER
__asm mov ax,4300h
__asm int 2Fh
__asm mov ALreturn,al
if (ALreturn == (char)0x80)
#else
#ifdef __WATCOMC__
if (XMSpresent() == (char)0x80)
#else
regs.x.ax = 0x4300;
int86(0x2F, &regs, &regs);
if (regs.h.al == (char)0x80)
#endif
#endif
#endif
{
puts("XMS driver is present");
rval = 0;
}
else
{
puts("No XMS driver present");
rval = -1;
}
return rval;
}
Some handy Borland specific things
_FLAGS pseudo register
All versions of the Borland compilers since Turbo C 2.0 implement
direct access to the CPU's flags register via a pseudo register
called _FLAGS. This can be an incredibly handy little device for
dealing with functions that give back error conditions in the
CPU's carry or zero flags.
#define CARRY_SET (_FLAGS & 0x0001)
/* blah, blah, blah,... */
geninterrupt(0x21);
if (CARRY_SET)
{
/* handle the error condition here */
}
_es _ds _ss _cs pointers
Another handy extension the Borland compilers implement is the
ability to declare a near 16 bit pointer type that is referenced
via a specific segment register. Suppose we have two functions
like these:
static void foo(int *p)
{
*p = 1234;
}
void bar(void)
{
int x;
foo(&x);
}
FOO_TEXT segment byte public 'CODE'
;
; static void foo(int *p)
;
assume cs:FOO_TEXT,ds:DGROUP
foo proc far
push bp
mov bp,sp
;
; {
; *p = 1234;
;
les bx,dword ptr [bp+6]
mov word ptr es:[bx],1234
;
; }
;
pop bp
ret
foo endp
;
; void bar(void)
;
assume cs:FOO_TEXT,ds:DGROUP
_bar proc far
enter 2,0
;
; {
; int x;
;
; foo(&x);
;
push ss
lea ax,word ptr [bp-2]
push ax
push cs
call near ptr foo
add sp,4
;
; }
;
leave
ret
_bar endp
static void foo(int _ss *p)
{
*p = 1234;
}
FOO_TEXT segment byte public 'CODE'
;
; static void foo(int _ss *p)
;
assume cs:FOO_TEXT,ds:DGROUP
foo proc far
push bp
mov bp,sp
push si
mov si,word ptr [bp+6]
;
; {
; *p = 1234;
;
mov word ptr ss:[si],1234
;
; }
;
pop si
pop bp
ret
foo endp
;
; void bar(void)
;
assume cs:FOO_TEXT,ds:DGROUP
_bar proc far
enter 2,0
;
; {
; int x;
;
; foo(&x);
;
lea ax,word ptr [bp-2]
push ax
push cs
call near ptr foo
pop cx
;
; }
;
leave
ret
_bar endp
_seg pointers
Another special Borland extension is the _seg pointer type.
_seg pointers are a type of pointer that maps directly to a
segment or selector value with an implied offset of zero.
An _seg pointer is a 16 bit variable. These are primarily
useful for saving space in situations where you would
normally have a "far" pointer where the 16 bit
offset part of the pointer is always going to be zero.
Using an _seg pointer in cases like this can sometimes save
memory, because a normal "far" pointer would be 4 bytes.
#include <dos.h>
#include <stdio.h>
int main(void)
{
char _seg *pPSP; /* an _seg pointer to our PSP */
char near *p;
unsigned char CmdLineLen;
/*
_psp is initialized by the Borland startup
code as being the segment address of this
program's PSP.
*/
pPSP = (void _seg *)_psp;
CmdLineLen = *(pPSP+128);
p = (char near *)129;
while (CmdLineLen--)
{
putchar(*(pPSP+p));
p++;
}
return 0;
}
Some handy Microsoft specific things
Based pointers
The P-code interpreter
tonyi@ibm.net
- Shut up and jump!
Last modified on Sunday, Dec 13, 1998
This page produced the old fashioned way - with a text editor