TIL: many compilers allowed identifiers of any length, but only examined the first 8 characters to determine what the identifier pointed at.

So `foobarbaz` and `foobarbazqux` are both valid, but refer to the **same** variable!

What I _didn't_ learn was why anyone thought this was a good idea. I assume it was for performance reasons, since Free Pascal doesn't do it anymore (wiki.freepascal.org/Identifier) and 8 makes me guess it has something to do with memory representation.

Anyone know?

@codesections

It was so the compiler and linker didn't have to worry too much about identifier length, yeah. C also had this limitation during that era. That's why functions in the std library are named 'strcat', 'memcpy', etc.

Neither C nor Pascal compilers have had this limitation for decades.

@codesections Did you know that C compilers are still only required to support up to 4095 characters on a line of source code? If you have a line of code in your C program that's longer than that, the standard doesn't guarantee it has to be compiled correctly :)

@cancel

> Did you know that C compilers are still only required to support up to 4095 characters on a line of source code? If you have a line of code in your C program that's longer than that, the standard doesn't guarantee it has to be compiled correctly :)

I did _not_ know that! And it's per physical line, not per statement? So using a JavaScript-style minifier for C source code and having a 1-line program would make the entire thing undefined?

@codesections Sort of :) If the compiler you're using says it's OK to do it in the manual, then it's OK to do it. gcc and clang and msvc could all handle it just fine. I'm not sure if tcc could, though.

@cancel

> Sort of :) If the compiler you're using says it's OK to do it in the manual, then it's OK to do it.

Yeah, fair. I guess I should have said "implementation defined" instead of "undefined"

@cancel when coding rules limit you to 80 characters per line, you don't have any problems with the compiler. 🤣 @codesections

@equals_w_equals @cancel @codesections FORTRAN 77 sets the limit at 72, and the first 6 columns of each line are reserved for the statement label and the continuation marker :)

@cancel@merveilles.town @codesections@fosstodon.org Meanwhile, 1980s BASICs, e.g. the one on Commodore machines, would often allow arbitrary-length variable names but only consider the first two (2) characters significant!

@cancel

> It was so the compiler and linker didn't have to worry too much about identifier length, yeah. C also had this limitation during that era.

Interesting. Was that just about speeding up compilation, or were there other benefits?

@codesections It makes the compiler simpler. `char name[9]` is a lot easier to deal with than some kind of string interning and pooling system. And these systems didn't have much memory by today's standards. A significant portion of it would be taken up by the identifiers themselves during linking.
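As a rough illustration of the `char name[9]` approach, here is a minimal sketch of a fixed-size symbol table. The names `struct symbol`, `lookup_or_add`, `SIG_CHARS`, and `MAX_SYMBOLS` are hypothetical, and this is not taken from any real compiler. It only shows why names sharing their first 8 characters end up as the same variable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIG_CHARS   8   /* significant characters, as in the thread */
#define MAX_SYMBOLS 64  /* arbitrary table size for this sketch */

/* One symbol-table entry: a fixed-size name, no heap allocation. */
struct symbol {
    char name[SIG_CHARS + 1];  /* 8 significant chars + NUL: char name[9] */
    int  value;
};

static struct symbol table[MAX_SYMBOLS];
static int symbol_count = 0;

/* Find an identifier, or add it; returns NULL if the table is full.
 * Only the first SIG_CHARS characters are compared or stored, so
 * longer names that share that prefix land on the same entry. */
struct symbol *lookup_or_add(const char *ident)
{
    assert(ident != NULL);

    for (int i = 0; i < symbol_count; i++) {
        if (strncmp(table[i].name, ident, SIG_CHARS) == 0)
            return &table[i];
    }
    if (symbol_count == MAX_SYMBOLS)
        return NULL;

    struct symbol *s = &table[symbol_count++];
    strncpy(s->name, ident, SIG_CHARS);  /* silently drops the tail */
    s->name[SIG_CHARS] = '\0';
    s->value = 0;
    return s;
}
```

With this table, `lookup_or_add("foobarbaz")` and `lookup_or_add("foobarbazqux")` return the same entry, because both are stored as `foobarba`. The whole table is one static array, so nothing is ever allocated or freed.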

@cancel is there any significance to names using 6 characters? I see this convention used a lot in older languages. The MUSIC N languages used 6-letter opcodes for example.
@codesections

@paul @codesections The original Fortran was limited to 6 characters for identifier names. Maybe just copied from that?

@cancel @paul

Honestly, having a hard/compiler-enforced length limit that causes an error seems *way* better than silently throwing away the end of the identifier and merging (what the programmer thought were) two different variables

@cancel yeah that seems likely. Did they need to be exactly 6 characters?
@codesections

@codesections Also true of many FORTRAN and C compilers, except the limit might be as low as 6 chars. Storing arbitrary-length strings doesn't just take more memory, it also needs dynamic allocation. A fixed-size string is easier, especially if your compiler may itself be written in ASM.

Last compiler I used with a name length limit was Sozobon C (no bozos) ~1990. It's not an ancient practice.

@codesections IDK, but I suppose it was only true around the time when FORTRAN IV was a thing (late 60's - early 70's). And even FORTRAN 77 has way too many strange design decisions like this. Memory then was really scarce and expensive, so that even compiled languages per se weren't usually considered a good idea. Also I guess nobody here knows what it was like to debug a vacuum-tube electronic computer, a huge oven where the joints may spontaneously desolder...

@codesections P.S. AFAIK, the "speed" back then was not even considered, the limitation was the memory footprint.

@codesections The goal of course is to help write unmaintainable code :)
