Re: [oc] Beyond Transmeta...
> Yes, that's true - I haven't looked at it this way...
> So if I understand you correctly, you are trying to calculate each 'bit
> plane' as fast as possible (from its point of view). I suppose this could
> theoretically be done faster than calculating 32-bit operands, even for
> sequential programs. But I am worried that programs would be extremely large.
Yeah, that is probably one big issue. The exception is when the network is
configured to work like an x86 or RISC processor: then a large chunk of memory
is used to create, in the network itself, the hardware necessary to run the
software in a normal serial manner. This decreases memory usage but also
decreases performance (sound familiar? :) - the old fight between memory and
performance).
> In how many cycles can you execute this? How many instructions do you need?
> c = add32(a,b)
> e = add32(c,d)
Well, in a network arrangement with a minimum of eight 1-bit processors, it
would take about 32 clocks; fewer processors increase that. But the minimum
required could be as little as 1 clock (depending on which bits change) if
only the first bit changes, or 2 clocks if only one other bit changes; of
course, if the change causes a turn-over with many carries, that will increase
the count. The instruction count is 1 for the first bit, 3 for the second bit,
and 4 for every following bit. There are other ways to arrange a network; I
believe that one is the most parallel I could create. I might be able to
create an even more parallel one, but it may not gain much in performance. It
would be a kind of temporal difference, in that the first initial pass could
have 63 instructions done in parallel (2 for every bit except the last), and
every pass after that is 2 instructions. There are many ways to configure a
network of bits, and each might have more benefits than the others.
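To make the add32 question above concrete, here is a minimal sketch (my own
illustration, not code from this thread) of a 32-bit add built as a chain of
1-bit full adders, the way the bit network described above would compose it:
each bit position is one tiny "processor", and the carry ripples from one to
the next. The chained case c = add32(a,b); e = add32(c,d) is shown at the end.

```python
def full_adder(a, b, cin):
    """One 1-bit processor: returns (sum bit, carry-out bit)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def add32(a_bits, b_bits):
    """Ripple 32 one-bit adders in a chain; final carry-out is dropped."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):  # index 0 = least significant bit
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out

def to_bits(n, width=32):
    return [(n >> i) & 1 for i in range(width)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

# The two dependent adds from the question: c = a + b, then e = c + d.
c = add32(to_bits(12345), to_bits(67890))
e = add32(c, to_bits(999))
print(from_bits(c), from_bits(e))  # -> 80235 81234
```

The serial carry chain is what makes the worst case slow; with enough 1-bit
processors, the sum bits of one add can start feeding the next add before the
first carry has finished rippling.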
> BTW: I am not such a pessimistic guy trying to criticise everything. When we
> were developing or2k, such 'comments' were very welcome.
Well, I was not so sure you were... Actually this conversation has been good;
there are some things about this that I had not realized, which were brought
up through this discussion - like the multiprocessor way of viewing the
network. If it were not for the questions, I would not have tried to look at
things in different ways.
> I suppose you can link data back to the loop start, can't you? In parallel
> you can detect whether the loop should be finished. Of course this isn't a
> normal equation anymore... But otherwise I don't see a problem here.
Oh, yeah... ha... I did not think of that. :)
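The feedback idea from the quote can be sketched in a few lines (a
hypothetical illustration of mine, not anything from the thread): the loop
body is a network whose outputs feed back to its inputs, while a separate
part of the network tests the termination condition on each pass.

```python
def loop_network(state, step, done):
    """Run a feedback network: 'step' is the loop body whose output is
    linked back to the loop start; 'done' is the termination test that
    could be evaluated in parallel with the body."""
    passes = 0
    while True:
        finished = done(state)    # termination detected alongside the body
        next_state = step(state)  # body output fed back to the loop start
        if finished:
            return state, passes
        state = next_state
        passes += 1

# A trivial counting loop: increment until the value reaches 5.
result, passes = loop_network(0, step=lambda i: i + 1, done=lambda i: i >= 5)
print(result, passes)  # -> 5 5
```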
> Yes, that is true for basic blocks, but not for functions. The compiler
> would create a separate network for each function (there are too many
> problems otherwise). You cannot link them dynamically together. I won't go
> into a detailed explanation; even when detecting parallelism between
> functions there are certain problems, unless the program (the meaning of the
> program) itself is modified. ILP here stands for inductive logic
> programming.
What I'm getting at, though, is not to modify the program's source code, but
to compile it into a network and then shift the network around into a more
parallel program. The way I think it would work is to shift the parts of the
network which compose a function into other functions, so that as you shift
them around they sort of lose themselves as a discrete and separate function
and instead become an integrated part.
I'm not exactly sure what you mean by creating separate networks for each
function. You could mean creating a separate network for each function
call/usage, which I think is what you mean. In that case, I would say you do
not necessarily have to do it that way: you could create a network that acts
like a mini high-level processor (with high-level instructions), which would
reduce the redundant creation of function networks by allowing the function
to be called via a high-level instruction. Instead of having the functions
directly connected to each other, they would act like many mini processors
connected together.

If you want to think about it differently, imagine that you could either have
billions of 1-bit processors working simultaneously (virtually, of course,
since they only work on bits that change), or a few 8-bit processors (faked
by the network) that do various tasks within the system, or even fewer 32-bit
processors doing various tasks, or one 64-bit processor that does everything
(like a normal CPU) - the latter taking up the least memory, while the 1-bit
network takes up the most. It's a really scalable environment that allows you
to create any kind of processor that is necessary; it will turn your
functions into a processor if it needs to, and it will basically balance
between consuming a lot of memory and resources and taking very little. If
it were not for this discussion, I don't think I would have realized that.
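The "only work on bits that change" point above can be illustrated with a
tiny sketch (my own, purely illustrative): the work an event-driven 1-bit
network redoes after an operation is bounded by the bits that actually
flipped, so incrementing an even number touches one bit processor while a
long carry turn-over touches many.

```python
def changed_bits(old, new, width=32):
    """Bit positions that differ between two values: the only positions an
    event-driven 1-bit network would need to recompute."""
    diff = old ^ new
    return [i for i in range(width) if (diff >> i) & 1]

# Adding 1 to an even number flips only bit 0 ...
print(changed_bits(4, 4 + 1))            # -> [0]
# ... but a carry turn-over ripples through several bit processors:
print(changed_bits(0b0111, 0b0111 + 1))  # -> [0, 1, 2, 3]
```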
Leyland Needham