Wherein Mike Warot describes a novel approach to computing which follows George Gilders call to waste transistors.

Thursday, March 17, 2005

BitGrid in 25 words or less

Get bit from each neighbor
concatenated 4 bit number becomes index
One lookup table per neighbor

Wednesday, March 16, 2005

Bitgrid pro and con - AKA the Thesis

If the Bitgrid is such a great idea, why haven’t I heard of it before?

One word answer: “Efficiency”

My review of the current literature, aided by my trusty pal Google has shown that the past 25 years of programmable/custom logic design is focused on serving one God, efficiency. All of the designs I’ve seen (admittedly a small subset because I’m not a professional circuit engineer) optimize on some or all of these common goals:

  • Speed
  • Power
  • Size
  • Circuit complexity
  • Unit cost

They do this for a very good set of reasons. You want the lowest power dissipation because it makes it easier to feed and clean up after. You want the fastest speed because that is the driving factor for using hardware instead of software. You want the smallest design size so that you use less die area, and have less chance of a losing a chip to a defect. The circuit complexity goal drives a huge investment in design tools to automate design tasks as much as possible.

The things that are often traded away for these goals in a chip design are:

  • Flexibility
  • Fault tolerance
  • Engineering costs

The primary reason for going with hardware in the first place is usually speed. If speed is not an issue, then it is usually a good idea to do a given task in software. Software is infinitely malleable, and far easier to patch and update.

Fault tolerance is usually excluded from designs of custom chips because it is difficult to achieve, and is better addressed by testing and quality control measures. Only when a given feature of a design is homogenous such as in RAM or ROM, is the option to include spares included in a design.

Engineering costs are usually considered last in a mass produced chip, but they are never trivial. The processes are optimized to automate as much of the design work as possible away from human engineers, but there are always going to be complexity limitations imposed by the heterogeneity of the elements in a given Programmable Logic / Custom ASIC design.

When viewed from the perspective of the design community I’ve observed, it becomes obvious why nobody has built a bitgrid yet… it’s inefficient as hell by their criteria. An insider would never seriously consider a bitgrid design.

That, gentle reader, is why you haven’t heard of the bitgrid before.

Reasons to consider the bitgrid

One word answer: “Efficiency”

Only in this case, a different set of parameters to optimize on:

  • Flexibility
  • Fault tolerance
  • Testing time
  • End users

The bitgrid is based on a single basic component, with a known, easily comprehended design, in an orthogonal grid. The homogenous nature of the grid makes it trivial to relocate a given portion of an application program.

It can route around a bad cell, if one is found. Each and every cell is a programmable wire at minimum utilization. As long as extensive faults are not present, and slack is available, it should be feasible to route around bad cells in almost any design.

Because it’s possible to test the RAM that stores the programming along with the bitgrid cells one at a time, it should be very easy to quickly and confidently test a chip with a minimum of testing equipment complexity.

Because of the simple nature of the bitgrid, a set of graphical tools for design and debugging can be built and will be applicable to any implementation of the chip. The parameters of the IO pins and array size are the main constraints to give to the tools. There are no design heterogeneity obstacles to complicate tool development.

There are, of course some bit trade offs made, including:

  • Speed
  • Latency
  • Power
  • Die Size
  • Unit cost

Compared to an ASIC, a bitgrid will always be slower, use more power, consume a larger die space and cost more per unit. A given bit will have to traverse at least 4 gates per cell, just to emulate a wire. The end user, will be compelled to expend some time optimizing their programming to fit in the smallest available bitgrid device applicable, of course.

Summary

I’m confident that as Moore’s law continues to drive down transistor prices, the bitgrid will become seen as a viable computing architecture for select applications, and may possibly mature into general use over time. The closest analogy I can evoke at this time is to the debates that were made with the introduction of high-level computer languages. It occurred when the cost of computation was driven below the cost of the programmers. I believe this transition is on its way for silicon.

I believe the best way to predict the future is to invent it. I hope you like my invention.

--Mike—

March 16, 2005

Friday, March 11, 2005

Why we need a Free Hardware Foundation

We have the Free Software Foundation as a result of Richard Stallmans irritation with closed source software. I propose a parallel organization to help promote and propagate open source, Free hardware. I want to be able to license the bitgrid under the equivalent of the GPL. I'm certain that there are others with ideas that they want to share in the same way.

Specifically, if my bitgrid idea actually pans out, I'd like to let anyone use it in a design according to an equivalent of the GPL for hardware. They would then be bound to do the same for their designs, so that improvements can get worked into the technology, and we all benefit.

The main threat I want to hold off is that of submarine patents. If we can build an open database of ideas which can be shared by all, it could be a very powerful tool. I'm not a lawyer, so I don't know if patent or copyright law would be the best place to try to build a GPL equivalent, but I'm sure its time to figure it out.

[update]
An analogy made in this post by Eric Ste-Marie in 1999:

On the other hand, the day that we can have a processor definition from the internet, download that definition in a "processor makng machine", add whatever material is needed, press a button, wait 5 minutes and "DING!" your processor is ready; well this day, maybe free hardware foundation will become popular and worth the effort. Until that time it will be a hobbyist thing and won't have any impact on the computer world compared to Free software.

Perhaps the bitgrid could fill that role by being virtual hardware?

Thursday, March 10, 2005

The bitgrid story

The Von Neuman architecture - Our current standard

It all started with the realization that most of the transistors in a modern computer are just sitting idle 99% of the time. Everyone focuses on the CPU, with a single instruction stream being read and executed at a blinding speed. Everything optimizes the CPU, thus RAM is designed to be able to read a cell quickly, when needed, but is idle otherwise. This has advantages, including until very recently, the lack of heatsink requirements for memory chips. The laptop I write this entry on has 512 megabytes of RAM, which means there are 4 Billion transistors which are just sitting there, waiting for their turn in the computing process. Meanwhile most of the 30 million or so transistors in the CPU are also dedicated to the "cache", which is a faster RAM, also present to help optimise the flow of data to and from the CPU.

At the heart of everything in the CPU, there are registers (another form of RAM), and the ALU (Arithmetic Logic Unit), Execution pipeline, instruction decoders, and assorted hardware. There are also specialized hardware optimizations such as "speculative execution" which literally computes advances the program along both forks in the instruction flow, until well after it is known which one is the proper path, in order to know the correct answer that much faster.

I can't emphasize enough how everything is optimized to get data to the CPU, through the CPU, and back out from the CPU. Its all about getting flow through the core as quickly as possible. The problem is that the core can only get so fast... because it always has to wait to talk to RAM.

The next logical step has been known for a very long time (before 1970?), parallel processing. I'm surpised its taken as long as it has for the dual core chips to show up. I expected them back in the early 1990s. They will help performance, and push things along because there will be multiple tasks happening while the CPU waits for the next set of data. Its not optimal though, because the programming tools and operating systems all are still optimized for the single instruction flow, but this shift has been long anticipated, and much research has been done, so there are appreciable benefits to this "multi-core" approach. (Intel calls it Hyperthreading when done by certain Pentium 4 chips)

This is the future path most computing hardware will take. The model is mature, and universally understood. It entails a known set of trade-offs which I have summarized above.

The Bit Grid

Bitgrid is a radically different architecture. I see the first uses as a coprocessor, for doing a fixed task at very high speed. Tasks which might otherwise be dedicated to a custom hardware processor, such as signal processing, graphics, compression, etc. I've always figured it would make a really good radar signal processor, or text search engine.

The concept is pretty simple, but the implications are radical. The basic computing element in a BitGrid is a cell, which has 4 neighbors, up, left, down, right. For each of these neigbors, there is an input, and output, and a program. So, in summary a cell has
  • 4 single bit inputs, one from each neighbor
  • 4 single bit outputs, one to each neighbor
  • 4 programs, one for each output (16 bits each)
The flow is simple:
  • combine the 4 inputs from the neigbors into a nibble - n
  • select bit n from each "program" i.e. up(n), left(n), down(n), right(n)
  • output that bit to each neighbor
The key is that once the programs are loaded, the bitgrid because a hardware virtualization of the logic embedded in the program. A very fast virtualization. All parts of the design run at the same time, there is no program counter, no need to optimize around a specific set of registers, etc. All of the cells can be in used if the programmer is efficient. (I've put this model into a spreadsheet that works well - though it was interesting getting around the circular reference restrictions once I tried to make a grid of cells)


Interesting properties of the bitgrid.
Fault tollerance - A bitgrid chip doesn't need to be perfect. If a cell is bad, but doesn't lie near an edge, it can be routed around. Software to do this routing would eventually find it way into the normal set of tools associated with bitchip deployment. This should greatly reduce the cost of the chips, should they ever be manufactured in bulk.

Program transformation - A program for a bitgrid could be automatically rotated, compartmentize, and routed to best fit a given chipset. As above, I expect this software to be a basic part of any bitgrid deployment.

Duplexing - When I decided on a 4 in, 4 program, 4 out architecture, it seemed a reasonable choice. When modeling the circuit for a 8 bit programmable 2s complement generator, it became apparent that data flow in more than one direction across a chip was quite useful. I had the mode bit going down, with the carry bit going up the same column. In non-trivial applications, this could help boost the efficiency of utilization of the cells.

Inertia as default - When programming a grid, I've chosen a set of defaults for my tools that implement what appears to be inertia. Once a bit is in motion across the grid, it defaults to travel in that direction, without collision. The physics parallels are intriguing.

Interface as RAM - It would be quite desireable to make the bitgrid appear to be a RAM chip externally. The "program" store could appear as a conigous block of static RAM to the host processor. The IO pins for each cell could be similarly mapped. Thus the first top 8 input bits would just be written to at address 0, for example. The output bits would be a read from the same address. Interfacing to a host becomes trivial.



My story
As I've stated before, the bitgrid idea goes way back with me, to the my brief stint as an Engineering student (Rose-Hulman, 1981-1982). I had read about, and studied a bit about bit slice processors, and came up with the idea of building a vast array of them. I wrote page upon page of different ideas, and tweaks to the idea, but then I let it all drop. (A bad habit I'm working to change)

I've talked with friends about it from time to time, but figured I'd never be able to do it because I don't have the resources or skills to get a chip built. Let alone the whole patenting process...

I've hit middle age, and am figuring out just what my life goals are. 43 things was a nice catalyst, and one of my entries was about the bit grid. (Another is a refactoring compiler, which will be discussed elsewhere). At this point my goals for the bitgrid are:
  • Have at least one other person truely "get" the idea, so I have a collaborator
  • Actually get a good technical discussion going about the merits of this architecture
  • Determine the true value of the concept, where its niche in the world of computing is
Assuming that I'm not crazy, then we move on to longer term goals:
  • Build some software emulation programs to test out ideas
  • Write some code to solve a non-trivial computing problem (for some reason FFTs keep coming to mind)
  • Figure out just how fast and cheap the chips would be
  • Get chips made (I don't care who does this, as long as it stays open source)
  • Debug said chip designs
  • Get good chips made
  • Find real customers, get feedback, loop
Right now... I'm just trying to find others who think the idea has merit, and want to sheppard it along. I thank Doc Searls for his encouragement, and I thank you for your time and consideration.

--Mike-

Wednesday, March 09, 2005

The bitgrid future

As I stated over at Herb Sutter's blog entry about concurrency, I think the future of computing will include the bitgrid, in some form. The Von Neuman architecture has its definite efficiencies, but the fact that a given RAM cell just sits there waiting almost 100% of the time, seems to me to be the ultimate in waste.

I look forward to the future, its going to be a wild ride getting there.

Sunday, March 06, 2005

Distilling a vision into something coherent is tough!

I've been working on my spreadsheets demos of the bitgrid concept. My latest efforts are online. Today's effort BitGrid-Learning01.xls is as small as I can make it (35k). I've also exported an HTML version, so you can get a sense of it, at least.

You see, each cell has 4 neighbors, above, left, below, right, each contributing one bit. This results in 16 possible values. I convert this to a 16 bit mask, which is a simple shift operation in assembler, and 2^x in a spreadsheet. I can then AND this with the 4 programs (one for each output), and simply output a 1 if the result is nonzero, otherwise 0.

In effect, the 4 bits are address, and the program for a cell is the input to a 16:1 data selector going to the appropriate output.

I'm hoping that the simplest possible example, made visible, can help transfer the concept to the reader. Its a lot more work than I thought, but very rewarding.

Thanks for your time.

--Mike--

Thursday, March 03, 2005

BitGrid emulation available

Earlier this week I had a flash of insight and decided to make a bitgrid emulator in a spreadsheet, and it worked. I was able to test out a few architectures, and learned that my gut feelings on the subject were on track. I've settled on a 4 input, 4 output cell, with each of the outputs being the result of a 16 bit table.

I've written friends, and solicited some opinions. I've put the emulator files online at http://basicsoftware.com/bitgrid/ Basically, they are fancy Microsoft Excel spreadsheets with some really nasty built in circular references from hell, but the do work. Please download the latest one, and play with it, see how this stuff works. Please let me know how I can make them more comprehesive and useful.

I've coded up an 8x8 cell grid, and demonstrated how to do 2 bit full addition, RS and D latches, and a 2s complement circuit. I'm amazed at just how much can be done with only 1 of the 8 columns of cells. It'll be interesting to see how tightly I can pack functionality into it.

Eventually, I want to build an FFT inside the grid, to see just how many cells it takes.

I'm also going to try to make a reasonable tutorial on programming this puppy. It's tedious at best, because it's all based on 16 bit numbers, which never seem to be in a easy to deal with order. I'm going to have to make some tools to simply the task.

I'm now at the point where the idea has a form, and I have to have enough faith in it to keep working on it. I've tended to abandon things at this point, so I'm really looking hard for some positive feedback on this one.

Thanks for your time
--Mike--

Wednesday, March 02, 2005

Preventing Patents?

It occurs to me that I have a good idea, but I really doubt my ability to successfully extract the almightly dollar from it, so I want to prevent others from patenting it?

What is the best way to release an invention to the public domain, to make sure others CAN'T patent it?

About Me

My photo
I fix things, I take pictures, I write, and marvel at the joy of life. I'm trying to leave the world in better condition than when I found it.