My concept of T-expressions roughly fills the same role that 'skeleton trees' (or rather, the informal use of vectors to implement them) seem to play in Magritte. A concrete syntax tree.
It's only slightly more expressive than a raw vector, and only in the sense that it adds the equivalent of 'cdr' to a vector, and formalises the informal pattern of 'first cell describes the data type'.
I think this extra information is important, but that's the hardest idea to sell.
Very interesting thread, thanks for posting! A couple of thoughts:
I really wonder if #racket will get popular enough to trigger an even bigger Cambrian explosion. It's certainly what they're going for
> maybe arrays beat (ie, are a more universal storage abstraction mechanism than) conses.
Any thoughts on #apl? Not minimal syntax, but similar ideas
Right, it seems to me that the basic Von Neumann machine model is 'array/vector of unsigned integers of some word size' and then we go from there.
I guess what I want to know is:
If we're building our own virtual machine that doesn't start from, say, the POSIX environment - what's the smallest model we can get away with?
PicoLisp seems pretty small, but it builds in some nasty stuff (eg evaluate an integer and you execute machine code of the attacker's choosing).
The 'serverless' people in the Cloud space are moving in on this territory. Amazon Lambda etc.
I'd like to see the non-proprietary, distributed-web community have an answer to what 'serverless function in the DWEB' might look like.
Either a very small Lisp or a very small Forth *might* be the right answer. Or, it might not. Forth in particular makes some very bad assumptions about security. And Lisp maybe doesn't play well with the C/POSIX memory layout. Dunno.
On 'universality being trivial':
Yes, and also no.
Anything TM-ish can emulate any other TM, yes. But that's not always the most relevant criterion for human usage of a computing system.
I suppose what I keep wanting is some kind of data-plus-computing substrate that can 'capture, share and remix most of human thought'. Something like a hyperlinked spreadsheet (data plus formulas/functions) but a bit more freeform.
SQL databases seem a bit too constrained.
The idea of homoiconicity I think is quite important. Without it, yes you can 'emulate' any data/rule set in any other data/rule markup language or format... but you can't always preserve it in the same 'shape'.
Preserving 'shape' (structure + syntax) seems important. Each time you transform data, you introduce the possibility of errors. So I think we'd like a format that introduces as little of its own structure as possible, ie, has a very minimal/general structure.
Raw S-exps are pretty good (maybe remapped from (.) to [|]\ because ( and . collide with commonly-used characters in English writing). They turn a sequence of chars into a sequence of atoms and lists, which gets us a very long way.
I'm suggesting adding ` to get us one extra ability, a user-definable type or 'term' tag. So a fixed character set of [`]\ plus whitespace. From there, we get S-expressions 'plus terms' which is a slightly extended form of Prolog's data model.
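To make that concrete, here's a toy reader for the syntax I'm describing, in Python. This is purely an illustrative sketch: the token names are mine, and I'm ignoring the \ escape character for simplicity. Atoms are whitespace-delimited, [ ... ] delimits a list, and `tag attaches a user-defined term tag to the expression that follows it.

```python
# Toy reader for the hypothetical [ ` ] syntax: atoms, lists, and
# backtick-tagged terms. The \ escape is deliberately omitted.
def tokenize(src):
    # Pad the three special characters with spaces, then split on whitespace.
    return src.replace('[', ' [ ').replace(']', ' ] ').replace('`', ' ` ').split()

def read(tokens):
    tok = tokens.pop(0)
    if tok == '[':
        items = []
        while tokens[0] != ']':
            items.append(read(tokens))
        tokens.pop(0)  # drop the closing ']'
        return items
    if tok == '`':
        # a term: a user-defined tag followed by one argument expression
        tag = tokens.pop(0)
        return ('term', tag, read(tokens))
    return tok  # a plain atom

expr = read(tokenize("`point [1 2]"))
# → ('term', 'point', ['1', '2'])
```

The point of the sketch is just how little machinery is needed: plain lists come out as S-expressions would, and the backtick gives you the one extra thing, a tagged term, which is roughly Prolog's compound-term shape.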
Theoretically, all we really need is to read/write arbitrary amounts of data and write conditions upon them. In practice we need to figure out what that data is meant to mean.
Lisp's S-Expressions and Haskell's Abstract Data Types do strike me as trivial yet useful memory models though.
Mmm. I think I'm trying for a 'concrete data type' rather than an abstract one. Or a 'concrete parse tree' perhaps.
Objects or ADTs do have the nice property that if you've got one in memory, you know it was generated 'correctly' (at least for this run of the program) because only one object (the constructor) is allowed to create them. But... this property gets VERY murky once we look beyond one single runtime and bring disk storage and network transmission into play.
A fairly universal definition of 'object' seems like: 'a bunch of sequential storage cells beginning with an object type/class/prototype/identifier, which is probably itself a storage address pointing to an object that describes it'.
Beyond that, it gets very hard to agree on semantics. So if we could get at least a readable/writeable version of 'type-labeled chunk of storage', that seems a bit of a win.
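That definition of 'object' can be sketched in a few lines. This is a toy model, with all names and addresses invented for illustration: a heap maps addresses to vectors of cells, and the first cell of each vector is the address of the object that describes it.

```python
# Toy model of 'object = sequential cells beginning with a type identifier,
# which is itself a storage address pointing to a descriptor object'.
heap = {}

def store(addr, cells):
    heap[addr] = cells

store(100, ['meta', 'point-class'])   # a descriptor object at address 100
store(200, [100, 3, 4])               # a point: its first cell points at 100

obj = heap[200]
descriptor = heap[obj[0]]             # follow the first cell to the describer
# descriptor is now ['meta', 'point-class']
```

Nothing here commits to any particular semantics for the descriptor; that's the part that's hard to agree on. The readable/writeable win is just the layout itself.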
There are still problems to work out, like how to break cycles and how to uniquely name identifiers.
Also things like: If an object requires a 'context' (another object) to be interpreted, find a way to automatically locate that context object and bring it into local storage.
Where 'context' might include things such as: class/prototype of an OO object; symbol table of a bunch of Lisp cells; libraries of a linked executable; environment of a closure.
Lua's idea of 'metatables' seems relevant and useful. It comes so very close to being a universal language.
I *think* Racket's idea of 'syntax' objects is somewhat like a cons marked up with a context?
I'd like to see something like that, but universal.
It probably needs to be slightly less granular than at the cons level, though; otherwise we triple our storage cost. Which is why I keep thinking about type-tagged arrays/vectors (ie very lightweight objects) as more universal than conses.
When I say 'triple the storage cost': an absolutely trivial and dumb universal storage layer would be a triple store. Very much like RDF. Every cell in the entire RAM space being a triple of
(type, car, cdr)
We probably don't want that. Wastes a huge amount of storage. But I think it would be the most general and universal possible data-management system, that removes some ambiguities that cons storage has and is more extendable.
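For illustration, the whole triple-store model fits in a few lines of Python. Everything here is a toy (cell indices standing in for addresses, index 0 as NIL); it just shows how the extra type field disambiguates what a cons-like cell is for.

```python
# Toy 'every cell is a (type, car, cdr) triple' storage model.
# Cell indices serve as pointers; cell 0 is NIL.
memory = [('nil', 0, 0)]

def alloc(typ, car, cdr):
    memory.append((typ, car, cdr))
    return len(memory) - 1

def cons(car, cdr, typ='cons'):
    return alloc(typ, car, cdr)

# Build the list (1 2 3), giving the head cell a user-defined type.
n3 = cons(3, 0)
n2 = cons(2, n3)
n1 = cons(1, n2, typ='int-list')  # the type field records intent

print(memory[n1])  # → ('int-list', 1, 2)
```

Wasteful, as I said, but it makes the extendability obvious: any cell can carry any type tag, not just the handful a Lisp implementation happens to build in.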
We could do Type with one bit, though.
I think most Lisps do this, or similar (PicoLisp steals the bits from the pointer to the cons, not from the car/cdr of the cons itself). But they tend to only allow built-in type tags, and they don't have a generally agreed-on extensible mechanism for reading/writing user-defined type tags to a serialised ASCII representation.
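The one-bit trick looks like this (a sketch of the general low-bit tagging idea, not PicoLisp's actual layout): because cells are word-aligned, the bottom bit of a valid pointer is always 0, so it's free to carry a type.

```python
# Low-bit pointer tagging: odd word = immediate integer, even word = pointer.
TAG_FIXNUM = 1

def box_int(n):
    return (n << 1) | TAG_FIXNUM   # shift left, set the tag bit

def is_int(word):
    return word & 1 == 1

def unbox_int(word):
    return word >> 1               # arithmetic shift recovers the value

w = box_int(21)
# is_int(w) is True, unbox_int(w) is 21; is_int(4) is False (4 looks like a pointer)
```

And that's exactly the limitation: one bit (or a few stolen bits) only buys you a fixed menu of built-in tags, with no extensible, serialisable story for user-defined ones.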
I think we need that, otherwise if you have in-memory types that can't be 100% correctly serialised, you don't really have a data storage layer.
And that's maybe the crux of my general feeling of unhappiness with the current leading edge of computer science and programming languages theory (Haskell et al and the type people):
I think they're focussing so much on 'correctness of computing / computability' that they have lost sight of 'correctness of *data import/export fidelity*'. And I think data is more important than computation.
Data outlives computation and passes through multiple computing machines.
The type theory and big compilation up front people are pushing so hard on 'restrict the allowable / expressible forms of data so that computation on that data is guaranteed to be correct'
But if you restrict the expressible forms of data, you introduce data loss. Data is what it is. It is primary. *Data is why we have computers*. Computation needs to be seen as a secondary thing that MUST adapt itself to the shape of the data - not the other way around.
And this feeds into a political issue:
1. Users create data. Their needs must be primary.
2. Programmers create only computation. Their needs are secondary to those of the user.
3. The programmer - the app developer, the database administrator, the server host, the business manager, the IT oligarch - does NOT and should NOT outrank the user.
4. But today's models and mindset of 'correctness' and 'best practice' in both programming theory and business puts 2 over 1.
It's basically the same problem that George Orwell saw when he satirised similar trends in linguistics as 'Newspeak'. The idea that you could 'eliminate error from communication' by strong definition and enforcement of allowable speech up front.
There will always be important data that end-users create that just doesn't fit the currently fashionable data schemas maintained by experts and big organizations.
We need to remember this and not pre-code such data as 'error'.
This also folds into my unhappiness with the Object-Oriented Programming model:
1. OOP says that all data should be considered 'objects', ie, computing machines
2. Computing machines are 'dangerous' in various ways (halting problem, memory security, etc) and so they must be restricted
3. But if you insist on modelling data as objects, then you insist on modelling a simple and safe and general thing as a complex and dangerous and restricted thing
4. Therefore, it would be better that we model objects and other computing machines (when we need to) as data, rather than data as objects. To do otherwise is to invert the abstractions, modelling a simple thing as a complicated thing, and that just invites errors of all kinds.
5. However, modelling objects as data violates one of the core rules of OOP: that objects be opaque 'black boxes' whose state cannot be accessed by the user.
6. There's a deep mismatch here.
I think runtime type tags on data are extremely important, so if that's what you mean by 'types' then I'm all for it.
The part of type theories that I dislike is the part where 1) all the type information is compiled out for efficiency, so it's not available at runtime, and 2) the type annotations (and the definitions of types) are written in a language which is, itself, not allowed to be represented as data at runtime.
Mostly (2) is what I don't like.
As for 'cubical types' and homotopy type theory, I have no idea what that is, so I'm somewhat curious.
but frankly, I think every successful type theory will be one that can be represented as data at runtime, and where new types can be generated at runtime by programs that obey that theory (and if a type theory can't do both of those things, then I think it's insufficient)...
I think every such theory will look more and more like 'literally Prolog' eventually.
Basically my main annoyance is that TYPES HAVE TO BE READABLE AND WRITEABLE AT RUNTIME DARNIT
because (in the case of a single Internet-spanning language, which is what I want to see) it's never NOT runtime. And there's no language 'outside' that language (otherwise you'd need infinity languages).
so types that are creatable at runtime mean that types have to be data that can be operated on by functions.
If a type theory agrees with this, then I'm for it.
Once the Haskell community, and the other strict-typing language communities, absorb the type-theory equivalent of what 'prototype OO' means to OOP... then we'll see some progress.
Oh, and one other observation:
'Types', as in, data that contains type tags...
... those are going to outlive 'type theories' in exactly the same way that ordinary data outlives algorithms. Theories fade, data endures.
We'll always need a 'date' type, even though there might be a zillion different ways of reasoning ABOUT that date type.
So we need a way of thinking about 'types' that doesn't privilege any particular type 'theory' (or proof procedure).
I feel like Prolog got a lot of things right (not all of them, but a lot), and part of what it got right came from its background as part of a project in *linguistics* more than logic.
The main one being a very relaxed approach to the idea of 'logical statement' which is very tolerant of statements which simply don't have - or even need to have - a fixed definition known at compile time.
I think maybe that embracing that sense of ambiguity is very important for always-on and always-connected systems.
Data is always flowing on the network, and it doesn't always have to be every program's business to understand all of it. Also, definitions will change over time, as will data flows.
We need logic-like systems that are comfortable with just not knowing some things.
Well, what I've been saying is, if the language is bootstrapped, then you can embed it within itself, and manipulate the runtime state from the "outside".
Or you can just use Maybe everywhere, and not make everything total.
That doesn't have to be a burden, because if the syntax is flexible/extensible enough, you can hide the Maybes.
Yep, I think it's very important that a language - *and* its typesystem, if it has one - be 'bootstrappable' in the sense that it must be able to embed itself in itself.
I just don't see 'bootstrappable typesystems' as being a big thing in the current type theory agenda. In fact I think reflection and self-referentiality is seen as pretty much evil by the type theory people.
No, a 'special' 'embeddable' version of the language is not the answer I want.
Do this, but for types, and you'll have something.
Otherwise you'll need an infinite stack of embeddings.
The answer I keep coming back to is:
A type tag is a logical statement.
A type theory is a set of logical statements.
In the logic programming paradigm, *everything in a computer* is just a logical statement.
We're making this way harder than it needs to be.
Just... consume and emit logical statements. That's all you need! Maybe one implication operator *and that's it*. Everything else is libraries.
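Here's roughly how small that can be, as a toy forward-chainer in Python. The facts and the typing rule are invented for illustration: statements are tuples, a rule is (body, head), and the single implication just says "whenever everything in body is known, emit head".

```python
# Toy 'consume and emit logical statements' engine with one implication operator.
facts = {('type', 'x', 'int'), ('type', 'y', 'int')}
rules = [
    # hypothetical typing rule: if x and y are ints, (+ x y) is an int
    ([('type', 'x', 'int'), ('type', 'y', 'int')],
     ('type', ('+', 'x', 'y'), 'int')),
]

# Forward-chain to a fixed point: keep firing rules until nothing new appears.
changed = True
while changed:
    changed = False
    for body, head in rules:
        if all(b in facts for b in body) and head not in facts:
            facts.add(head)
            changed = True

# facts now contains ('type', ('+', 'x', 'y'), 'int')
```

A real system would need variables and unification (i.e., literally Prolog), but the shape of the thing is already here: type tags are just statements, and a type theory is just a rule set over them.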
@codesections @natecull @alcinnz
I'm pretty noob at type theory in general, sooo. Idk, I'll try.
Basically cubical types let you say that there are various paths between types annnd that lets you transport things across types.
This is prooobably a decent introduction to them?
@codesections Racket is very cool, but for some reason I keep bouncing off it. In particular it just seems very difficult to get up to speed on the Racket concepts of 'language' and 'syntax'. I should try again at some point.
My original desire was for a game DSL, which is why it needed to be 1) small and 2) absolutely safe (don't want people pasting a neat game file off the Web and whoops it erases their hard drive). So how to *remove* Racket features is what I find tricky to learn.