It all started with the realization that most of the transistors in a modern computer sit idle 99% of the time. Everyone focuses on the CPU, with its single instruction stream being read and executed at blinding speed. Everything is optimized around the CPU: RAM is designed to read a cell quickly when needed, but sits idle otherwise. This has advantages, including, until very recently, the lack of heatsink requirements for memory chips. The laptop I write this entry on has 512 megabytes of RAM, which means there are roughly 4 billion transistors just sitting there, waiting for their turn in the computing process. Meanwhile, most of the 30 million or so transistors in the CPU itself are dedicated to the "cache", a faster form of RAM that also exists to optimize the flow of data to and from the CPU.
At the heart of the CPU are the registers (another form of RAM), the ALU (Arithmetic Logic Unit), the execution pipeline, the instruction decoders, and assorted support hardware. There are also specialized hardware optimizations such as "speculative execution", which advances the program along both forks of a branch until well after it is known which one is the proper path, in order to have the correct answer that much sooner.
I can't emphasize enough how everything is optimized to get data to the CPU, through the CPU, and back out of the CPU. It's all about getting flow through the core as quickly as possible. The problem is that the core can only get so fast, because it always has to wait to talk to RAM.
The next logical step has been known for a very long time (since before 1970?): parallel processing. I'm surprised it's taken as long as it has for dual-core chips to show up; I expected them back in the early 1990s. They will help performance and push things along, because there will be multiple tasks happening while the CPU waits for its next set of data. It's not optimal, though, because the programming tools and operating systems are all still optimized for a single instruction flow. But this shift has been long anticipated, and much research has been done, so there are appreciable benefits to this "multi-core" approach. (Intel's Hyperthreading, on certain Pentium 4 chips, is a related way of running multiple instruction streams on one core.)
This is the path most computing hardware will take in the future. The model is mature and universally understood, and it entails a known set of trade-offs, which I have summarized above.
The Bit Grid
BitGrid is a radically different architecture. I see its first uses as a coprocessor, doing a fixed task at very high speed: tasks which might otherwise be handed to a custom hardware processor, such as signal processing, graphics, or compression. I've always figured it would make a really good radar signal processor, or text search engine.
The concept is pretty simple, but the implications are radical. The basic computing element in a BitGrid is a cell, which has 4 neighbors: up, left, down, and right. For each of these neighbors there is an input, an output, and a program. So, in summary, a cell has:
- 4 single-bit inputs, one from each neighbor
- 4 single-bit outputs, one to each neighbor
- 4 programs, one for each output (16 bits each)

On every step, each cell:

- combines the 4 inputs from its neighbors into a nibble, n
- selects bit n from each "program", i.e. up(n), left(n), down(n), right(n)
- outputs that bit to the corresponding neighbor
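The cell's behavior is small enough to sketch in a few lines of Python. The nibble ordering (up, left, down, right) and the function name here are my assumptions for illustration; nothing above pins down how a real implementation would number the bits.

```python
def cell_step(inputs, programs):
    """One BitGrid cell: 4 input bits in, 4 output bits out.

    inputs:   (up, left, down, right) bits arriving from the neighbors
    programs: four 16-bit lookup tables, one per output, in the same
              (up, left, down, right) order
    """
    up, left, down, right = inputs
    n = (up << 3) | (left << 2) | (down << 1) | right  # combine into a nibble
    # Each output is simply bit n of its program.
    return tuple((p >> n) & 1 for p in programs)

# Example: program the "down" output to be the XOR of the up and left inputs.
xor_up_left = sum((((n >> 3) & 1) ^ ((n >> 2) & 1)) << n for n in range(16))
outputs = cell_step((1, 0, 0, 0), (0, 0, xor_up_left, 0))
```

With up=1 and left=0, the down output comes out as 1; with up=1 and left=1 it comes out as 0, exactly as an XOR gate should. Any 1-bit function of the 4 inputs can be loaded the same way, which is what makes the cell universal.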
Interesting properties of the BitGrid:
Fault tolerance - A BitGrid chip doesn't need to be perfect. If a cell is bad, but doesn't lie near an edge, it can be routed around. Software to do this routing would eventually find its way into the normal set of tools associated with BitGrid deployment. This should greatly reduce the cost of the chips, should they ever be manufactured in bulk.
Program transformation - A program for a BitGrid could be automatically rotated, compartmentalized, and routed to best fit a given chip. As above, I expect this software to be a basic part of any BitGrid deployment.
Duplexing - When I decided on a 4-in, 4-program, 4-out architecture, it seemed a reasonable choice. When modeling the circuit for an 8-bit programmable two's complement generator, it became apparent that data flowing in more than one direction across the chip was quite useful: I had the mode bit going down a column, with the carry bit going up the same column. In non-trivial applications, this could help boost the efficiency of cell utilization.
Inertia as default - When programming a grid, I've chosen a set of defaults for my tools that implement what appears to be inertia: once a bit is in motion across the grid, it defaults to continue traveling in that direction, without collision. The physics parallels are intriguing.
Interface as RAM - It would be quite desirable to make the BitGrid appear, externally, to be a RAM chip. The program store could appear as a contiguous block of static RAM to the host processor. The I/O pins for each cell could be similarly mapped; the first 8 input bits of the top row would just be written to address 0, for example, and the output bits would be read from the same address. Interfacing to a host becomes trivial.
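One hypothetical way that mapping could look for a small grid, with the program store laid out first and the I/O bits after it. Every number here (grid size, base addresses, bytes per cell) is an illustrative assumption, not a real chip's layout:

```python
# Hypothetical memory map for a 16x16 BitGrid exposed as static RAM.
# Four 16-bit programs per cell = 8 bytes of program store per cell.
GRID_W, GRID_H = 16, 16
PROG_BYTES_PER_CELL = 8

PROG_BASE = 0x0000  # program store starts at the bottom of the address space
IO_BASE = PROG_BASE + GRID_W * GRID_H * PROG_BYTES_PER_CELL  # I/O bits follow

def program_address(row, col):
    """Host-visible byte address of a cell's 8-byte program block."""
    return PROG_BASE + (row * GRID_W + col) * PROG_BYTES_PER_CELL
```

With a layout like this, loading a program is just a block write into ordinary-looking SRAM, and exchanging data is reads and writes at `IO_BASE`, so the host needs no special bus or driver beyond what any static RAM chip would require.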
As I've stated before, the BitGrid idea goes way back with me, to my brief stint as an Engineering student (Rose-Hulman, 1981-1982). I had read about and studied bit-slice processors, and came up with the idea of building a vast array of them. I wrote page upon page of different ideas and tweaks, but then I let it all drop. (A bad habit I'm working to change.)
I've talked with friends about it from time to time, but figured I'd never be able to do it because I don't have the resources or skills to get a chip built, let alone handle the whole patenting process...
I've hit middle age, and am figuring out just what my life goals are. 43 Things was a nice catalyst, and one of my entries there was about the BitGrid. (Another is a refactoring compiler, which will be discussed elsewhere.) At this point my goals for the BitGrid are:
- Have at least one other person truly "get" the idea, so I have a collaborator
- Actually get a good technical discussion going about the merits of this architecture
- Determine the true value of the concept, and where its niche in the world of computing lies
- Build some software emulation programs to test out ideas
- Write some code to solve a non-trivial computing problem (for some reason FFTs keep coming to mind)
- Figure out just how fast and cheap the chips would be
- Get chips made (I don't care who does this, as long as it stays open source)
- Debug said chip designs
- Get good chips made
- Find real customers, get feedback, loop