• I32LP64 vs. ILP64 (was: VAX)

    From Anton Ertl@3:633/280.2 to All on Wed Aug 6 21:28:45 2025
    BGB <cr88192@gmail.com> writes:
    > counter-argument to ILP64, where the more natural alternative is LP64.

    I am curious what makes you think that I32LP64 is "more natural",
    given that C is a human creation.

    ILP64 is more consistent with the historic use of int: int is the
    integer type corresponding to the unnamed single type of B
    (predecessor of C), which was used for both integers and pointers.
    You can see that in various parts of C, e.g., in the integer type
    promotion rules (every integer operand is promoted at least to int,
    and beyond that only when a wider integer type is involved).
    Another example is

    main(argc, argv)
    char *argv[];
    {
    return 0;
    }

    Here the return type of main() defaults to int, and the type of argc
    defaults to int.

    As a consequence, one should be able to cast int->pointer->int and
    pointer->int->pointer without loss. That's not the case with I32LP64.
    It is the case for ILP64.
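
    A minimal sketch of the round-trip point, in C. The function names
    are hypothetical and chosen only for illustration. It does not cast
    a pointer to int directly (that conversion may be undefined when the
    value does not fit); instead it models the loss by squeezing a
    pointer-sized value through 32 bits, and contrasts that with
    intptr_t, which is meant for the round trip.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Nonzero if a plain int is wide enough to hold any object pointer.
   Expected to be false on I32LP64 and true on ILP64. */
int int_holds_pointer(void)
{
    return sizeof(int) >= sizeof(void *);
}

/* Nonzero if a pointer-sized value survives being narrowed to 32 bits
   and widened again. This models what a 32-bit int does to an address. */
int survives_32bit(uintptr_t v)
{
    return (uintptr_t)(uint32_t)v == v;
}

/* Round-trip a pointer through intptr_t (an optional C99 type). */
void *roundtrip_intptr(void *p)
{
    intptr_t i = (intptr_t)p;
    return (void *)i;
}

/* Print the sizes that define the data model and check the round trip. */
void report_roundtrip(void)
{
    int x = 42;

    assert(roundtrip_intptr(&x) == (void *)&x);
    printf("sizeof(int)=%zu sizeof(long)=%zu sizeof(void *)=%zu\n",
           sizeof(int), sizeof(long), sizeof(void *));
    printf("int holds pointer: %s\n", int_holds_pointer() ? "yes" : "no");
}
```

    Calling report_roundtrip() from main() prints the sizes for the
    platform at hand; what it reports depends on the data model the
    compiler uses.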

    Some people conspired in 1992 to set the de-facto standard, and made
    the mistake of deciding on I32LP64
    <https://queue.acm.org/detail.cfm?id=1165766>, and we have paid for
    this mistake ever since, one way or the other.

    E.g., the designers of ARM A64 included addressing modes for using
    32-bit indices (but not 16-bit indices) into arrays. The designers of
    RV64G added several sign-extending 32-bit instructions (ending in
    "W"), but not corresponding instructions for 16-bit operations. The
    RISC-V manual justifies this with

    |A few new instructions (ADD[I]W/SUBW/SxxW) are required for addition
    |and shifts to ensure reasonable performance for 32-bit values.

    Why were 32-bit indices and 32-bit operations more important than
    16-bit indices and 16-bit operations? Because with 32-bit int, every
    integer type is automatically promoted to at least 32 bits.
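
    The promotion rule can be checked directly in C. The sketch below
    uses hypothetical function names. The first function shows that
    adding two shorts yields an int-sized result; the second is the kind
    of code where, with a 32-bit int on a 64-bit machine, the index has
    to be widened to the pointer width before the address is formed.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the result of adding two shorts. Both operands are promoted
   to int first, so this is sizeof(int), not sizeof(short). The operand
   of sizeof is not evaluated. */
size_t short_sum_size(void)
{
    short a = 1, b = 2;
    return sizeof(a + b);
}

/* Array access with an int index. On I32LP64 the 32-bit index must be
   sign-extended to 64 bits; this is the case the 32-bit-index
   addressing modes and the "W" instructions cater for. On ILP64 the
   index would already be 64 bits wide. */
long element_at(const long *base, int i)
{
    return base[i];
}

/* Small self-check of the two functions above. */
void check_promotion(void)
{
    long a[3] = {10, 20, 30};

    assert(short_sum_size() == sizeof(int));
    assert(element_at(a + 1, -1) == 10);  /* negative index needs sign extension */
}
```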

    Likewise, with ILP64 the size of integers in computations would always
    be 64 bits, and many scalar variables (of type int and unsigned) would
    also be 64 bits. As a result, 32-bit indices and 32-bit operations
    would be rare enough that including these addressing modes and
    instructions would not be justified.

    But, you might say, what about memory usage? We would use int32_t
    where appropriate in big arrays and in fields of structs/classes with
    many instances. We would access these array elements and fields with
    LW/SW on RV64G and the corresponding instructions on ARM A64, no need
    for the addressing modes and instructions mentioned above.
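
    As an illustration of that style, here is a small C sketch. The
    struct and function names are hypothetical. Storage uses int32_t, so
    each element costs 4 bytes, while the index and the accumulator are
    64 bits wide on a 64-bit target, so only the load itself is a 32-bit
    operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record with many instances: narrow fields for storage. */
struct sample {
    int32_t x;
    int32_t y;
};

/* Sum the x fields. Each v[i].x is a 32-bit load that is widened on the
   way into a register; the index i and the accumulator are pointer-sized
   and 64-bit respectively, so no 32-bit arithmetic is involved. */
int64_t sum_x(const struct sample *v, size_t n)
{
    int64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += v[i].x;
    return total;
}

/* Small self-check of sum_x(). */
void check_sum_x(void)
{
    struct sample v[3] = {{1, 2}, {3, 4}, {5, 6}};

    assert(sum_x(v, 3) == 9);
}
```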

    So the addressing mode bloat of ARM A64 and the instruction set bloat
    of RV64G that I mentioned above are courtesy of I32LP64.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

    * Origin: Institut fuer Computersprachen, Technische Uni (3:633/280.2@fidonet)