As far as I can tell, I've been working on the BitGrid since the early 1980s; I recall finding notebook pages about it from those days. I started this blog back in 2004. Since getting Long Covid in 2020, and the accompanying brain fog (plus being over 40..50..60), I don't remember exactly how it started. In 2022, someone wrote a page about it on the Esoteric Languages Wiki, complete with some examples. Here are the things I've learned that I'm fairly sure of, and all the context I can give you.
First, a bit of relevant computing history
When the ENIAC, one of the earliest computers, was built, programming it was a matter of connecting cables in plugboards and throwing switches on "function tables". This effectively wired up a very expensive special-purpose computer. Programming could take weeks, but everything could then work in parallel at state-of-the-art speeds, equivalent to about 500 floating point operations per second.
Then the von Neumann architecture was grafted onto it, and the speed was cut by a factor of 6, because all of the inherent parallelism was lost.
Modern FPGAs are optimized for the lowest possible latency, which is why they have switching fabrics on them. This makes them expensive, which tends to drive the need for maximum utilization, which then drives adding "special features" like dedicated RAM blocks, multipliers, etc.
In the end, you've got a very weird, heterogeneous mess. The hardware description languages that you end up forced into using do their best to push your design into all the nooks and crannies, but it's never simple, nor fast. Compiles can take hours, and you're always going to be worried about timing issues.
Much like the ENIAC's problems, which eventually forced a sacrifice of run-time performance to make programming easier... the BitGrid makes similarly horrible, but useful, tradeoffs.
I've been toying with alternative architectures since the 1980s as a hobby. Over the decades I've become convinced that the best architecture is a grid of cells, like a chess board. Each cell would have 4 inputs and 4 outputs, one of each for every neighbor. Latches would store the outputs of each cell, so that computation is spread across 2 phases. Every cell that was computing would see stable inputs, and there would be no race conditions.
However, this makes it hard to think about... it might be worth using the conceptually easier model of latching all of the inputs in parallel, and also latching all of the outputs, at least for the first version of a chip. It's equivalent, as far as I can tell, but slower at run time.
Then there's the problem of memory. I started thinking about this about 6 months ago, as it seemed that I might actually be able to make an ASIC of the BitGrid.
In FPGAs, they add dedicated blocks of RAM that you can aggregate in various ways. I rejected this approach, because it would add the von Neumann bottleneck, and ruin the homogeneity that makes the BitGrid universal.
In trying to actualize the BitGrid design, I learned that the LUTs on FPGAs are effectively a string of D flip-flops, daisy-chained so that the bits can be streamed in at device programming time, then run through a multiplexer. If you add some more logic around this string of flip-flops, you can then use it as a serial RAM, without losing generality or introducing the von Neumann bottleneck. I called it Isolinear Memory back in September.
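A rough sketch of that idea: model the LUT as a chain of D flip-flops with a recirculating mux on its front end. This is my reconstruction of the mechanics, not the post's actual design; the class name, bit width, and method names are all hypothetical.

```python
# Sketch: a LUT's flip-flop chain doubling as serial memory.
# A 16-bit LUT is a chain of 16 D flip-flops; normally the truth
# table is shifted in at programming time and read through a mux.
# Adding a recirculating mux lets the same chain act as a 16-bit
# serial RAM: each clock, the bit falling off the end either
# recirculates (non-destructive read) or is replaced (write).
# Hypothetical reconstruction -- not the post's verified circuit.

class SerialLUT:
    def __init__(self, bits=16):
        self.chain = [0] * bits  # the D flip-flop chain

    def clock(self, write=False, data_in=0):
        """One shift clock. Returns the bit falling off the end."""
        out = self.chain[-1]
        fed = data_in if write else out  # recirculate unless writing
        self.chain = [fed] + self.chain[:-1]
        return out

    def lut_read(self, addr):
        """Normal LUT operation: 4 address bits select a stored bit."""
        return self.chain[addr & 0xF]
```

The key property is that reading by shifting all 16 bits around the loop leaves the contents exactly as they were, so the structure still works as a LUT between accesses, and no shared address/data bus is needed.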