Control System Computer Architecture D2

This blog post is a sequel to the blog post titled "Application Software Encapsulation Idea X1 and a Doomsday Computer Architecture Idea D1". The main difference between the Doomsday Computer Architecture D1 and the Control System Computer Architecture D2 is that the D1 is meant for general purpose computing, where it is acceptable for a program to allocate memory dynamically and to crash because the operating system hits its memory limits, while the D2 is required to work without dynamic memory allocation and, in the strictest case, all of the programs of the D2 must run hard-real-time. Another way to put this is that the D1 is for mathematical experimentation and arbitrary precision calculations, where a human can re-design an algorithm, rewrite the program or modify its execution parameters if the D1 runs out of memory, but the D2 must always use fixed precision arithmetic, because dynamic memory allocation is not allowed.
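To illustrate the constraint, here is a minimal C sketch of what D2-style code might look like: every buffer has a compile-time size, so the program can never be killed at run time for exceeding a memory limit, and the arithmetic is fixed-point. The 16.16 format and the names (SAMPLE_COUNT, fix_mul, scaled_average) are just illustrative assumptions, not part of any specification.

    #include <stdint.h>

    /* D2 style: every buffer has a compile-time size, no malloc anywhere. */
    /* Fixed-point arithmetic in 16.16 format stands in for the            */
    /* "fixed precision arithmetic" requirement.                           */
    #define SAMPLE_COUNT 64               /* worst case, known at build time */

    typedef int32_t fix16_16;             /* 16 integer bits, 16 fraction bits */

    static fix16_16 samples[SAMPLE_COUNT];  /* static, never grows */

    static fix16_16 fix_mul(fix16_16 a, fix16_16 b)
    {
        return (fix16_16)(((int64_t)a * (int64_t)b) >> 16);
    }

    fix16_16 scaled_average(fix16_16 gain)
    {
        int64_t sum = 0;
        for (int i = 0; i < SAMPLE_COUNT; i++)
            sum += samples[i];
        return fix_mul((fix16_16)(sum / SAMPLE_COUNT), gain);
    }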


Each CPU-module acts as a library of functions, in the simplest case plain C programming language functions. Recursion is not allowed in D2 software, with the exception of tail recursion, which has to be converted to a loop either manually or at build time. Some forms of recursion involve multiple functions: f1(...) calls f2(...), which calls f3(...), which in turn calls f1(...). Those forms of recursion are also forbidden in D2 software.
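As an example of the one permitted form, here is a tail-recursive function next to the loop it has to be converted into. The gcd is my choice of example, nothing D2-specific:

    #include <stdint.h>

    /* Tail-recursive greatest common divisor: forbidden in D2 as written, */
    /* because the compiler is not guaranteed to eliminate the call.       */
    uint32_t gcd_recursive(uint32_t a, uint32_t b)
    {
        if (b == 0)
            return a;
        return gcd_recursive(b, a % b);   /* tail call */
    }

    /* The mandatory loop form: same result, constant stack usage. */
    uint32_t gcd_loop(uint32_t a, uint32_t b)
    {
        while (b != 0) {
            uint32_t t = a % b;
            a = b;
            b = t;
        }
        return a;
    }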

The nodes in the complete graph are not ordered globally, but within each pair of nodes the motherboard/cluster-software-settings/simulation-settings set one of the nodes, semi-randomly, to have the higher priority at initiating conversations. I call that node the PairMASTER and the node with the lower priority at initiating a conversation the PairSLAVE. Which node in any given pair is the PairMASTER and which is the PairSLAVE is irrelevant; the roles are held constant only to avoid atomic-operation-based locks for sorting the matter out. If there are N nodes in the complete graph, then each node, the CPLD of each node, connects to N-1 PairBUSes. Part of the real-time operation guarantees is implemented by linearly scanning a constant sequence of PairBUSes and by setting a deadline for servicing each PairBUS. The scheme, where a node picks one PairBUS, answers the function call that comes in from that PairBUS, and then picks the next one in the constant sequence, works under the assumption that recursion is forbidden, and recursion is forbidden anyway, as a consequence of the ban on dynamic memory allocation.
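A minimal sketch of such a scan loop, with made-up hardware-access stubs (pairbus_poll, pairbus_serve, now_us) that are assumptions of this sketch, not existing D2 code:

    #include <stdbool.h>
    #include <stdint.h>

    #define N_NODES 8
    #define N_BUSES (N_NODES - 1)          /* complete graph: N-1 PairBUSes per node */
    #define SERVICE_DEADLINE_US 200        /* budget for servicing one PairBUS */

    /* Hypothetical hardware-access stubs. */
    bool     pairbus_poll(int bus);                     /* is a call waiting?         */
    void     pairbus_serve(int bus, uint64_t deadline); /* answer one function call   */
    uint64_t now_us(void);

    void scan_forever(void)
    {
        for (;;) {
            /* The scan order is a constant sequence, fixed at build time,   */
            /* so the worst-case wait of every PairBUS is known in advance.  */
            for (int bus = 0; bus < N_BUSES; bus++) {
                if (pairbus_poll(bus))
                    pairbus_serve(bus, now_us() + SERVICE_DEADLINE_US);
            }
        }
    }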

There is no main(...) function. Instead, each CPU-module may have events, which execute some event handler (function) on that CPU-module. System-wide grid-lock is avoided by putting "self" into the constant sequence of PairBUSes and by making sure that the function call stacks of event handlers that originate from different nodes do not share any nodes. Since nodes that are busy talking to each other are not available to other nodes, the easiest way to construct a hard-real-time system is probably to require that only one of the nodes is allowed to have events/event-handlers and that the execution of the event handlers is governed by some carefully calculated rules. To speed the system up from a soft-real-time point of view, the PairBUS has 2 extra wires, one for each node in the pair. Each node in the pair can indicate to the other that it is busy serving someone else. That way, when the PairBUSes in the constant sequence are scanned, the PairBUSes of busy nodes can simply be skipped.
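Extending the previous sketch with the "self" slot and the busy wire might look roughly like this; again, SELF_SLOT, pairbus_peer_busy and run_pending_local_events are made-up names:

    #include <stdbool.h>

    #define N_SLOTS 8                 /* N-1 PairBUSes plus one "self" slot       */
    #define SELF_SLOT 0               /* position of "self" in the constant sequence */

    bool pairbus_peer_busy(int slot);     /* reads the peer's extra "busy" wire   */
    bool pairbus_poll(int slot);
    void pairbus_serve(int slot);
    void run_pending_local_events(void);  /* event handlers of this CPU-module    */

    void scan_step(int slot)
    {
        if (slot == SELF_SLOT) {
            /* "Self" is just another entry in the sequence, so local      */
            /* event handlers cannot starve the PairBUSes or vice versa.   */
            run_pending_local_events();
        } else if (!pairbus_peer_busy(slot) && pairbus_poll(slot)) {
            /* Busy peers are skipped; their slot costs one wire read.     */
            pairbus_serve(slot);
        }
    }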

The system can communicate with other computer systems by having one of the nodes act as a post office for asynchronous messaging. That post office can also act as a gateway to outside networks. In ships/cars/planes/trains the nodes should communicate with each other by using symmetric key encryption, with one huge, multi-GiB key per pair. The cost of storing such a key is just a few euros, yet it renders NSA-style crypto-attacks practically useless, especially if the amount of communication is so low that the key acts as a one-time pad for the first few years of operation, and even after that the key can be slightly protected by salting, which has to be done anyway to hide metadata.
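Roughly, the keying idea in C, with made-up names (pair_channel, pad_crypt): as long as the key offset never wraps around, the XOR below is a genuine one-time pad. The offset bookkeeping is the part that actually has to be done carefully:

    #include <stddef.h>
    #include <stdint.h>

    /* One multi-GiB key per node pair, stored in cheap flash/disk.       */
    /* key_offset must be advanced past every byte ever used and must     */
    /* never wrap for the one-time-pad guarantee to hold.                 */
    typedef struct {
        const uint8_t *key;      /* the shared pairwise key material */
        uint64_t       key_len;  /* e.g. several GiB                 */
        uint64_t       key_offset;
    } pair_channel;

    /* XOR-encrypt (or decrypt: the operation is its own inverse) in place. */
    int pad_crypt(pair_channel *ch, uint8_t *msg, size_t len)
    {
        if (ch->key_offset + len > ch->key_len)
            return -1;           /* pad exhausted: fall back to salted reuse */
        for (size_t i = 0; i < len; i++)
            msg[i] ^= ch->key[ch->key_offset + i];
        ch->key_offset += len;
        return 0;
    }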

If the communication is encrypted anyway, then emergency routing through other nodes is OK. One of the functions in every node is to pass an asynchronous message from one node to another. It is not perfect, but if some idiot cuts the cable between the plane engine and the pilots' instrumentation, then getting the commands from the pilots' instrumentation to the engine through a coffee machine, even with a slight delay, is a nice choice and probably allows a safe and smooth landing without the passengers even noticing. The nice thing is that, thanks to the strong encryption, no bug is able to ask any of the nodes to execute any of their functions, and as every PairBUS connects only 2 nodes, a single bug is not able to flood, DoS, the nodes that it is not directly connected to. The 2 nodes that are affected can automatically re-route their communication through other nodes that are connected to them over PairBUSes that are not bugged/cut/noise-flooded.
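The forwarding function could look roughly like this; the fixed-size frame, pairbus_link_ok and pairbus_send are made-up names for this sketch:

    #include <stdbool.h>
    #include <stdint.h>

    #define N_NODES 8

    typedef struct {
        uint8_t  dest;           /* final destination node              */
        uint8_t  len;
        uint8_t  payload[64];    /* fixed-size frame: no dynamic allocation */
    } frame;

    bool pairbus_link_ok(int node);          /* direct link to that node usable? */
    void pairbus_send(int node, const frame *f);

    /* Forward a frame toward f->dest, taking a detour when the direct   */
    /* PairBUS is cut, bugged, or flooded with noise.                    */
    void route_frame(int self, const frame *f)
    {
        if (pairbus_link_ok(f->dest)) {
            pairbus_send(f->dest, f);        /* the normal, direct path */
            return;
        }
        for (int hop = 0; hop < N_NODES; hop++) {
            if (hop != self && hop != f->dest && pairbus_link_ok(hop)) {
                pairbus_send(hop, f);        /* emergency detour, e.g. the coffee machine */
                return;
            }
        }
        /* No usable link at all: drop the frame, or report locally. */
    }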

As of 2016_06_10 I'm really looking forward to trying this architecture out. I guess that it would be one hell of a fun experiment!!! Grid-lock-less, without a central main function, and all of that optionally hard-real-time!!! Awesome!!!

++++++++++++++++++++++++++++++
I'll probably update, change, edit, modify this blog-post later.
