Omg it is SO REFRESHING to see language designers acknowledging that REPLs exist and that yes, indentation-sensitivity FAILS HARD IN A FIRE in a REPL

(from the Magritte language thesis, )

My concept of T-expressions roughly fills the same role that 'skeleton trees' (or rather, the informal use of vectors that implements them) seem to fill in Magritte: a concrete syntax tree.

It's only slightly more expressive than a raw vector, and only in the sense that it adds the equivalent of 'cdr' to a vector, and formalises the informal pattern of 'first cell describes the data type'.

I think this extra information is important, but that's the hardest idea to sell.

I think Magritte is on the wrong track with using % for lexical variables though. I'd stick with the familiar $ and use % only for dynamic variables, which should be a much smaller set.

Very interesting thread, thanks for posting! A couple of thoughts:

> Javascript seems to be the current hothouse for language design… It's fun to see this Cambrian explosion of language concepts

I really wonder if will get popular enough to trigger an even bigger Cambrian explosion. It's certainly what they're going for

> maybe arrays beat (ie, are a more universal storage abstraction mechanism than) conses.

Any thoughts on ? Not minimal syntax, but similar ideas

@codesections @natecull On the arrays point, my understanding is that arrays are more efficient on modern hardware, but they're not more universal. Universality, after all, is trivial!

I mean machine code's underlying memory model is an array of numbers, though caches do add an (invisible) wrinkle.

@alcinnz @codesections

Right, it seems to me that the basic Von Neumann machine model is 'array/vector of unsigned integers of some word size' and then we go from there.

I guess what I want to know is:

If we're building our own virtual machine that doesn't start from, say, the POSIX environment - what's the smallest model we can get away with?

PicoLisp seems pretty small, but it builds in some nasty stuff (eg evaluate an integer and you execute machine code of the attacker's choosing).

@alcinnz @codesections

The 'serverless' people in the Cloud space are moving in on this territory. Amazon Lambda etc.

I'd like to see the non-proprietary, distributed-web community have an answer to what 'serverless function in the DWEB' might look like.

Either a very small Lisp or a very small Forth *might* be the right answer. Or, it might not. Forth in particular makes some very bad assumptions about security. And Lisp maybe doesn't play well with the C/POSIX memory layout. Dunno.

@alcinnz @codesections

On 'universality being trivial':

Yes, and also no.

Anything TM-ish can emulate any other TM, yes. But that's not always the most relevant criterion for human usage of a computing system.

I suppose what I keep wanting is some kind of data-plus-computing substrate that can 'capture, share and remix most of human thought'. Something like a hyperlinked spreadsheet (data plus formulas/functions) but a bit more freeform.

SQL databases seem a bit too constrained.

@alcinnz @codesections

The idea of homoiconicity I think is quite important. Without it, yes you can 'emulate' any data/rule set in any other data/rule markup language or format... but you can't always preserve it in the same 'shape'.

Preserving 'shape' (structure + syntax) seems important. Each time you transform data, you introduce the possibility of errors. So I think we'd like a format that introduces as little of its own structure as possible, ie, has a very minimal/general structure.

@alcinnz @codesections

Raw S-exps are pretty good (maybe remapped from (.) to [|]\ because (.) collide with commonly-used characters in English writing). They turn a sequence of chars into a sequence of atoms and lists, which gets us a very long way.

I'm suggesting adding ` to get us one extra ability, a user-definable type or 'term' tag. So a fixed character set of [`]\ plus whitespace. From there, we get S-expressions 'plus terms' which is a slightly extended form of Prolog's data model.
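One possible reading of that `[`]\`-plus-whitespace character set can be sketched in a few lines. This is purely illustrative (the names and the exact semantics are my assumptions, not natecull's spec), and it omits the `\` escape character for brevity: a backtick attaches a user-defined term tag to whatever follows, and everything else is atoms and square-bracketed lists.

```javascript
// Hypothetical sketch of an S-expression "plus terms" reader.
// Assumed semantics: `tag payload attaches a term tag; [ ] delimit lists;
// the \ escape character is omitted for brevity.

function tokenize(src) {
  // Split into brackets, backticks, and whitespace-separated atoms.
  return src.match(/[\[\]`]|[^\s\[\]`]+/g) || [];
}

function parse(tokens) {
  const tok = tokens.shift();
  if (tok === "[") {
    const list = [];
    while (tokens[0] !== "]") {
      if (tokens.length === 0) throw new Error("unclosed [");
      list.push(parse(tokens));
    }
    tokens.shift(); // drop "]"
    return list;
  }
  if (tok === "`") {
    // A term: a user-defined tag followed by a payload, e.g. `point [3 4]
    const tag = tokens.shift();
    const body = parse(tokens);
    return { tag, body };
  }
  return tok; // plain atom
}

const expr = parse(tokenize("[`point [3 4] hello]"));
// expr → [{ tag: "point", body: ["3", "4"] }, "hello"]
```

The payoff is exactly the "Prolog data model, slightly extended" shape: plain lists stay plain arrays, and tagged terms carry their user-defined type along with them.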

@natecull @codesections Yes, basically what I want to encourage with the phrase "universality is trivial" is to focus on the human factor.

Theoretically all we really need is to read/write arbitrary amounts of data and write conditions upon them. In practice we need to figure out what that data is meant to mean.

Lisp's S-Expressions and Haskell's Abstract Data Types do strike me as trivial yet useful memory models though.

@alcinnz @codesections

Mmm. I think I'm trying for a 'concrete data type' rather than an abstract one. Or a 'concrete parse tree' perhaps.

Objects or ADTs do have the nice property that if you've got one in memory, you know it was generated 'correctly' (at least for this run of the program) because only one object (the constructor) is allowed to create them. But... this property gets VERY murky once we look beyond one single runtime and bring disk storage and network transmission into play.

@alcinnz @codesections

A fairly universal definition of 'object' seems like: 'a bunch of sequential storage cells beginning with an object type/class/prototype/identifier, which is probably itself a storage address pointing to an object that describes it'.

Beyond that, it gets very hard to agree on semantics. So if we could get at least a readable/writeable version of 'type-labeled chunk of storage', that seems a bit of a win.

Then there's working out how to break cycles, and how to uniquely name identifiers.
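The definition above ('sequential storage cells beginning with a type pointer, which itself points at an object that describes it') can be sketched as a toy heap. Everything here is invented for illustration; the self-describing root descriptor is one way to break the "what describes the describer?" regress.

```javascript
// Toy heap of "type-labeled chunks of storage": each object is an array of
// cells whose cell 0 is the address of the object that describes it.

const heap = [];

function alloc(typeAddr, ...cells) {
  const addr = heap.length;
  heap.push([typeAddr, ...cells]); // cell 0 is always the type pointer
  return addr;
}

// A root descriptor that describes itself, closing the cycle at the top.
const META = alloc(0, "type-descriptor");
heap[META][0] = META;

const POINT = alloc(META, "point", 2); // a type: name + field count
const p = alloc(POINT, 3, 4);          // an instance of that type

function typeNameOf(addr) {
  return heap[heap[addr][0]][1]; // follow cell 0, read the name field
}
// typeNameOf(p) → "point"
```

Because the whole heap is just numbers and strings, serialising it is trivial, which is the point: the type labels survive the trip to disk or network.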

@alcinnz @codesections

Also things like: If an object requires a 'context' (another object) to be interpreted, find a way to automatically locate that context object and bring it into local storage.

Where 'context' might include things such as: class/prototype of an OO object; symbol table of a bunch of Lisp cells; libraries of a linked executable; environment of a closure.

Lua's idea of 'metatables' seems relevant and useful. It comes so very close to being a universal language.

@alcinnz @codesections

I *think* Racket's idea of 'syntax' objects is somewhat like a cons marked up with a context?

I'd like to see something like that, but universal.

It probably needs to be slightly less granular than at the cons level, though; otherwise we triple our storage cost. Which is why I keep thinking about type-tagged arrays/vectors (ie very lightweight objects) as more universal than conses.

@alcinnz @codesections

When I say 'triple the storage cost': an absolutely trivial and dumb universal storage layer would be a triple store. Very much like RDF. Every cell in the entire RAM space being a triple of

(type, car, cdr)

We probably don't want that. Wastes a huge amount of storage. But I think it would be the most general and universal possible data-management system, that removes some ambiguities that cons storage has and is more extendable.

We could do Type with one bit, though.

@alcinnz @codesections

ie: if most (car, cdr) cells are understood to be pointers to at least a 16-bit storage space (probably 32 or 64-bit), then we can steal bits to represent some well-known Types, reserving one to say 'the car is the Type, cdr is the payload'.
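That bit-stealing trick can be sketched concretely. The tag assignments below are invented for illustration; the real point is just that pointer alignment leaves low bits free, so a cell word can encode a few well-known types inline, reserving one tag for the extensible 'car is the Type, cdr is the payload' case.

```javascript
// Sketch of stealing the low two bits of an aligned pointer word as a
// type tag. Tag values here are illustrative, not any real Lisp's scheme.

const TAG_PTR = 0b00; // word is a plain pointer to a cons cell
const TAG_INT = 0b01; // word is a small integer (payload in high bits)
const TAG_SYM = 0b10; // word is a symbol-table index
const TAG_EXT = 0b11; // "the car is the Type, cdr is the payload"

const tagOf   = (word) => word & 0b11;
const payload = (word) => word >> 2;
const makeInt = (n)    => (n << 2) | TAG_INT;

const w = makeInt(42);
// tagOf(w) → TAG_INT, payload(w) → 42
```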

@alcinnz @codesections

I think most Lisps do this, or similar (PicoLisp steals the bits from the pointer to the cons, not from the car/cdr of the cons itself). But they tend to only allow built-in type tags, and they don't have a generally agreed-on extensible mechanism for reading/writing user-defined type tags to a serialised ASCII representation.

I think we need that, otherwise if you have in-memory types that can't be 100% correctly serialised, you don't really have a data storage layer.

@alcinnz @codesections

And that's maybe the crux of my general feeling of unhappiness with the current leading edge of computer science and programming languages theory (Haskell et al and the type people):

I think they're focussing so much on 'correctness of computing / computability' that they have lost sight of 'correctness of *data import/export fidelity*'. And I think data is more important than computation.

Data outlives computation and passes through multiple computing machines.

@alcinnz @codesections

The type theory and big-compilation-up-front people are pushing so hard on 'restrict the allowable / expressible forms of data so that computation on that data is guaranteed to be correct'.

But if you restrict the expressible forms of data, you introduce data loss. Data is what it is. It is primary. *Data is why we have computers*. Computation needs to be seen as a secondary thing that MUST adapt itself to the shape of the data - not the other way around.

@alcinnz @codesections

And this feeds into a political issue:

1. Users create data. Their needs must be primary.

2. Programmers create only computation. Their needs are secondary to those of the user.

3. The programmer - the app developer, the database administrator, the server host, the business manager, the IT oligarch - does NOT and should NOT outrank the user.

4. But today's models and mindset of 'correctness' and 'best practice' in both programming theory and business puts 2 over 1.

@alcinnz @codesections

It's basically the same problem that George Orwell saw when he satirised similar trends in linguistics as 'Newspeak'. The idea that you could 'eliminate error from communication' by strong definition and enforcement of allowable speech up front.

There will always be important data that end-users create that just doesn't fit the currently fashionable data schemas maintained by experts and big organizations.

We need to remember this and not pre-code such data as 'error'.

@alcinnz @codesections

This also folds into my unhappiness with the Object-Oriented Programming model:

1. OOP says that all data should be considered 'objects', ie, computing machines

2. Computing machines are 'dangerous' in various ways (halting problem, memory security, etc) and so they must be restricted

3. But if you insist on modelling data as objects, then you insist on modelling a simple and safe and general thing as a complex and dangerous and restricted thing

@alcinnz @codesections

4. Therefore, it would be better that we model objects and other computing machines (when we need to) as data, rather than data as objects. To do otherwise is to invert the abstractions, modelling a simple thing as a complicated thing, and that just invites errors of all kinds.

5. However, modelling objects as data violates one of the core rules of OOP: that objects be opaque 'black boxes' whose state cannot be accessed by the user.

6. There's a deep mismatch here.

@alcinnz @codesections

OO doesn't specify that all data is built as objects, only that abstract, extractable data should be treated as such. Streams aren't technically objects per se.

@natecull @alcinnz @codesections What if you're smart about the conversions and use something like cubical types?

@natecull @alcinnz @codesections
Ok, read some other parts of the thread aannnnnd I'm not convinced that types would be bad for data.
Big compilation is bad, but not types.
Types can tell a Lot about what a function or data type actually is, which could let one do some cool metaprogramming.

@natecull @alcinnz @codesections
Let's say I wanna add a new column to a TSV file and to its corresponding type. I could tell the REPL to rewrite all existing usages up-front, so e.g. pattern matches are re-written to ignore the new field.

@natecull @alcinnz @codesections
Or, if it's not automatic, one could go over all parts of the code that match the constructor and make sure they make sense.

And when they are done messing with code, the type info will let them compile it to something efficient.

@grainloom @alcinnz @codesections

I think runtime type tags on data are extremely important, so if that's what you mean by 'types' then I'm all for it.

The part of type theor[ies] that I dislike is the part where 1) all the type information is compiled out for efficiency so it's not available at runtime, and 2) the type annotations (and the definitions of types) are written in a language which is, itself, not allowed to be represented by data at runtime.

Mostly (2) is what I don't like.

@grainloom @alcinnz @codesections

As for 'cubical types' and homotopy type theory, I have no idea what that is, so I'm somewhat curious.

but frankly, I think every successful type theory that can be represented by data at runtime, and where new types can be generated by programs (that obey that type theory) at runtime (and if a type theory can't do both of those two things then I think it's insufficient)...

I think every such theory will look more and more like 'literally Prolog' eventually.

@grainloom @alcinnz @codesections


because (in the case of a single Internet-spanning language, which is what I want to see) it's never NOT runtime. And there's no language 'outside' that language (otherwise you'd need infinitely many languages).

so types that are creatable at runtime mean that types have to be data that can be operated on by functions.

If a type theory agrees with this, then I'm for it.

@grainloom @alcinnz @codesections

By comparison: Javascript trivially meets this standard because its equivalent of 'type' is 'prototype', and because prototypes are objects, all prototypes can be created at runtime. Except that a prototype isn't quite a type in the Haskell et al. sense.

Once the Haskell and other strict-typing languages community absorbs the type-theory equivalent of what 'prototype OO' means to OOP... then we'll see some progress.

@grainloom @alcinnz @codesections

Oh, and one other observation:

'Types', as in, data that contains type tags...

... those are going to outlive 'type theories' in exactly the same way that ordinary data outlives algorithms. Theories fade, data endures.

We'll always need a 'date' type, even though there might be a zillion different ways of reasoning ABOUT that date type.

So we need a way of thinking about 'types' that doesn't privilege any particular type 'theory' (or proof procedure).

@natecull @alcinnz @codesections

Actually, it's more like "always enforcing big compilation is bad".

@grainloom @alcinnz @codesections

Yeah, I don't mind compilation, as long as 'compile' is a function I can call at runtime on arbitrary code-shaped data.

@grainloom @alcinnz @codesections

I feel like Prolog got a lot of things right (not all of them, but a lot), and part of what it got right came from its background as part of a project in *linguistics* more than logic.

The main one being a very relaxed approach to the idea of 'logical statement' which is very tolerant of statements which simply don't have - or even need to have - a fixed definition known at compile time.

@grainloom @alcinnz @codesections

I think maybe that embracing that sense of ambiguity is very important for always-on and always-connected systems.

Data is always flowing on the network, and it doesn't always have to be every program's business to understand all of it. Also, definitions will change over time, as will data flows.

We need logic-like systems that are comfortable with just not knowing some things.

@natecull @alcinnz @codesections

Well, what I've been :thounking: is, if the language is bootstrapped, then you can embed it within itself, and manipulate the runtime state from the "outside".

Or you can just use Maybe everywhere, and not make everything total.

That doesn't have to be a burden, because if the syntax is flexible/extensible enough, you can hide the Maybes.

@natecull @alcinnz @codesections

(and if your types are simple enough, maybe they can be deduced automatically)

@natecull @alcinnz @codesections
The embedded version would even let you inject special functions that let you do runtime introspection and monkey patching, but then of course it would _only_ work in the embedded version, augmented with the embedder.

@grainloom @alcinnz @codesections

Yep, I think it's very important that a language - *and* its typesystem, if it has one - be 'bootstrappable' in the sense that it must be able to embed itself in itself.

I just don't see 'bootstrappable typesystems' as being a big thing in the current type theory agenda. In fact I think reflection and self-referentiality are seen as pretty much evil by the type theory people.

No, a 'special' 'embeddable' version of the language is not the answer I want.

@grainloom @alcinnz @codesections

I keep pointing to Javascript and prototype OO as the right answer.

Does Javascript need a 'special embedded' version of itself to create a new prototype? No! You use the tools the language gives you to just create a new object.

Do this, but for types, and you'll have something.
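The claim is easy to demonstrate. No meta-dialect, no embedding: a new prototype is minted at runtime with the same `Object.create` used for any other object, and the chain stays inspectable as plain data (names below are illustrative).

```javascript
// A new prototype is just an object, created at runtime with ordinary tools.
const animal = {
  describe() { return `${this.name} (${this.kind})`; },
};

// Derive a new prototype at runtime, then an instance of it.
const dog = Object.create(animal);
dog.kind = "dog";

const rex = Object.create(dog);
rex.name = "Rex";

rex.describe(); // → "Rex (dog)"

// The "type" relationship is itself reachable, inspectable data:
Object.getPrototypeOf(rex) === dog; // → true
```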

Otherwise you'll need an infinite stack of embeddings.

@grainloom @alcinnz @codesections

The answer I keep coming back to is:

A type tag is a logical statement.
A type theory is a set of logical statements.

In the logic programming paradigm, *everything in a computer* is just a logical statement.

We're making this way harder than it needs to be.

Just... consume and emit logical statements. That's all you need! Maybe one implication operator *and that's it*. Everything else is libraries.
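A toy reading of 'consume and emit logical statements, one implication operator': facts as ground statements, rules as premises-implies-conclusion, and 'type checking' as forward chaining over the fact set. This is an invented illustration, not real Prolog; a real system would add variables and unification.

```javascript
// Minimal forward chaining: everything is a statement; the only operator
// is implication (if all premises hold, emit the conclusion).

const facts = new Set(["tag(d1, date)", "tag(d2, date)"]);

const rules = [
  { if: ["tag(d1, date)"], then: "printable(d1)" },
  { if: ["tag(d2, date)"], then: "printable(d2)" },
];

function chain(facts, rules) {
  let changed = true;
  while (changed) {
    changed = false;
    for (const r of rules) {
      if (r.if.every((p) => facts.has(p)) && !facts.has(r.then)) {
        facts.add(r.then);
        changed = true;
      }
    }
  }
  return facts;
}

chain(facts, rules);
// facts now also contains "printable(d1)" and "printable(d2)"
```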

@grainloom @alcinnz @codesections

There already is a demo. It's nearly 50 years old. Everyone already uses it. It's called Prolog!

But seriously, fixing Prolog is annoying because it got so many things almost but not quite right and it's actually hard to find that next slightly better thing.

@natecull @alcinnz @codesections

It definitely is a big thing; bootstrapping Agda and type theory were the main research topics of my Agda teacher.

@natecull @alcinnz @codesections
But the typedness would let you write the libraries that don't change in the same language, and then you get (very) efficient compiled code.

@grainloom @natecull @alcinnz

> use something like cubical types?

Do you happen to have an explanation of cubical types? I'm not familiar with that term, and a Duck Duck Go search only turned up things like "Equivalences can be transformed into equalities", which wasn't enlightening

@codesections @natecull @alcinnz
I'm pretty noob at type theory in general, sooo. Idk, I'll try.
Basically cubical types let you say that there are various paths between types annnd that lets you transport things across types.

This is prooobably a decent introduction to them?

@codesections Racket is very cool, but for some reason I keep bouncing off it. In particular it just seems very difficult to get up to speed on the Racket concepts of 'language' and 'syntax'. I should try again at some point.

My original desire was for a game DSL, which is why it needed to be 1) small and 2) absolutely safe (don't want people pasting a neat game file off the Web and whoops it erases their hard drive). So how to *remove* Racket features is what I find tricky to learn.

@natecull @codesections There is a sandbox egg for Chicken:

And Chicken's good at C integration, which is essential in games.
