DIAL: directed information assembly line

over the past few months I've been intermittently posting some snippets and fragments of ideas on a new information processing system. I've codified most of what I was talking about into a semi-formal protocol description. I think this has tremendous potential for wide applications, and I suspect many of these ideas below are already being used in many diverse contexts but have not been unified into a single specification (which I think will increase their value and use signficantly). I suspect something like the following is actually going to be a natural, inevitable evolution of the current "information infrastructure" and cyberspace-- i.e. something like the following is going to evolve whether I personally work on it or not, or whether future designers see this particular essay or not. anyway, I will try to incorporate comments from correspondents into future revisions. I would love to put together a group of people interested in continuing to develop this although that's probably premature at this point. thanks to everyone who has (unwittingly) contributed to this so far based on responding to my earlier essays. === DIAL = introduction = terminology: flits, floutes & bloxes = fault tolerance = tracing = time estimation = common bloxes = implementation = examples introduction == The industrial revolution was driven by specific technologies. Similarly, the "information revolution" is now in full pace using a different array of new tools. However, it is unlikely that all information processing tools have been invented yet. This paper is a preliminary sketch of one such potentially significant tool called DIAL. DIAL stands for "directed information assembly line". This document contains a description of this novel information processing architecture. This proposal outlines the idea and contrasts it with existing systems, showing how it highlights certain aspects of data processing that are not specifically addressed in, but seem to be implied by, other existing tools. The system can be viewed variously as a programming environment, a monitoring system, a revision control system, a quality assurance and quality control mechanism, a fault tolerant computing system, a workflow (re)engineering technology, a company intranet routing algorithm, an operating system based on virtual reality, etc. terminology == DIAL is something like a "dataflow" oriented system. This document will not describe any specific implementation of the DIAL concepts (such as giving a language syntax) but instead focus on its abstract properties which can be implemented in multiple ways. There are 3 basic components in the DIAL universe: "flit" - a flit is a unit of information that has an associated DIAL state. The information can change over time. One aspect of state is "location". "flit" stands for "fleeting bit", i.e. a piece of data that can "move". The number of binary bits allocated to a flit may vary per flit or over time. "floute" - a floute is a "flit route" or a path that a flit can take. Conceptually flits move through the flouts. A flout can be implemented in various ways, such as "last in, first out", "first in, last out", a pool of data, etc. "blox" - a blox is a "black box" or a component that changes the information content of a flit. Bloxes are connected to floutes. The DIAL system is recursive in that any of the 3 basic objects can be contained or encoded in the 3 objects. For example, flits can contain flits, bloxes can contain further floutes and bloxes, etc. Conceptually, a DIAL system is a directed graph, with nodes called "bloxes", edges called "floutes", and a superimposed set of things called "flits" that can, over time, "move through" the network. In many cases flits have a natural analogy to messages being passed through the system. Bloxes contain a single internal state, like a regular automaton. They can send requests, and respond to, memory bloxes. DIAL is unlike a programming language in that "time" is considered a key property of what it models. Many languages handle the concept of time implicitly through the use of variables. But programming languages have a computation-centric view of processing, such that programs are seen as directing and operating on data. In DIAL, data is seen as flowing through components, a data-centric view. fault tolerance == To be implemented correctly, the state of a DIAL system at any given time is incorruptable. All operations have total integrity. The system is designed to coordinate unreliable subprocesses and must itself be reliable. In DIAL, a blox is roughly analogous to some kind of process. The process may or not be entirely computational. The blox models an unreliable process. The process is activated when a flit approaches the blox from a connecting floute, at which point the flit "enters" the blox. DIAL handles the protocol of informing the blox (or rather, the process represented by the blox) of the presence of the flit. DIAL keeps track of all flits that are currently being processed by bloxes. The blox should process the flit and push the flit into some other floute which signals it has successfully operated on the flit. Combinations of flits at inputs can be processed, and multiple outputs are supported. The DIAL system allows processing time limit rules to be associated with flits, floutes, and bloxes, such as a floute assigning time limits to flits that move through it, time limits associated with all flits going through particular bloxes, or time limits attached to flits. (The system will have a precedence to these rules.) The motion of flits through floutes is handled by the DIAL system and is incorruptable. Flits can "pile up" in floutes if not processed by connecting bloxes as rapidly as they accumulate. All bloxes may have different amounts of processing times on incoming flits. When a blox fails to "return" a flit in the time limit, the DIAL system can be programmed to automatically take particular countermeasures. The countermeasure programs are associated with flits in the same way the expiration time is (via the flit, a floute, or a blox, and having precedence rules). - DIAL can ask the blox, "have you heard of this flit". The blox can reply, (1) "yes, I am still working on it", or (2) "no, I have not heard of it". The rules can specify possibilities such as resubmitting the flit, cancelling the flit processing and redirecting it elsewhere, propagating other flits into floutes (which might represent message(s) sent to "failure controllers"), etc. - The blox may reply, "the flit has corrupted the blox". This may happen in systems without transaction integrity. Again rules can automatically be followed to try to "clean up" the system by propagating new flits to particular floutes or possibly resubmit the flit. - The blox may not reply. Again, rules for countermeasures can be programmed into the DIAL system. tracing == All flits have unique IDs that can be traced. At any time a query can be sent to the DIAL system, "where is so-and-so flit?" and the system will describe the exact location of the flit. Flits can never vanish, even when bloxes fail to operate correctly. Every flit has a history as well. The flit may contain different information at different times. The system allows some number of earlier information states of the flit to be accessed. Information about the past flow-path of the flit and each associated change in contents (prior and subsequent to entry and exit of a blox) is available. This could be called a "replay" feature. In a query of a DIAL system, it may actually reply, "the flit was deleted", although its earlier states would still be accessable. Another possible response is, "the flit moved out of this DIAL system", but again some number of its penultimate states, while it was still "inside", would still be accessable. Again, customizable rules determine how much history is available. General queries such as "locate all type [y] flits that have not moved within [x] time period" are supported. Past histories of the flits that have moved through flouts and bloxes are also available. The system can support some degree of "global or local rollbacks" in which prior processing flows are reset, redirected, restarted, etc. An ability to locate components based on traffic is supported, such as floutes where current flit queue lengths are of some size, etc. The system allows the assignment of arbitrary version numbers with particular flit states that are also allowed in queries. time estimation == An implementer of a DIAL system might support specialized queries called "time estimates". A flit is passed into a system with a special flag that indicates processing time should be estimated but results should not be computed. The flit flows as far through the system as possible and records time estimates as it passes the bloxes. When it finally emerges, cumulative statistics on the time estimates that would be associated with an actual processing of the flit are available to the requester. Common bloxes == - extracter/combiner Bloxes to extract flits, floutes, or bloxes encoded in flits are available, as well as to create flits that encode any of the same objects. - warehouse The DIAL system can support a flit warehouse in which all flits are stored when they are not being propagated elsewhere through the system. The warehouse is a blox that responds to flit queries in the form, "move flit [x] into floute [y]". (The motion of the flits into and out of the warehouse resembles the checkin and checkout task of RCS software.) - create bloxes In a static DIAL system, all bloxes and floutes are predetermined and fixed. In a dynamic system, the bloxes and floutes may change over time. This is accomplished by feeding special flits into bloxes that can create other bloxes and floutes as their result. Other bloxes can connect them in specified ways. The dynamic system is far more complex and is reserved for specialized situations. - rerouter A special rerouter blox is useful for dealing with new versions of other bloxes. A frequent problem that arises with new versions of software (i.e. a new blox) is that it is incompatible or has bugs. The rerouter blox is capable of rerouting a request to a new version of a blox to an older version when problems arise in the processing. - tester Often results output from bloxes are to be tested for consistency to ensure the bloxes are functioning properly and not returning spurious results. - tester/comparer/rerouter Another useful blox for fault tolerance can take the results of two other bloxes and compare them for discrepancies. Flits output from a new version of a blox could be compared with the flits from the previous version, and automatic actions be taken on any discrepancy (such as passing through the old version while flagging the exception). The comparer allows the system designer to more elegantly deal with regular "upgrades". In fact it is the embodiment of automated regression testing. - isolater Often input of flits is batched into groups, and some flit in the group may cause problems in the processing, but further detail in revealing the exact "bad" flit is not available. The isolater can automate this process in an algorithm that resembles the classic linear search. The input batch into a blox is consecutively split into halves by the isolater until the smallest erroneous pieces are isolated and hilighted. - global versioner Often a system with many bloxes has new versions of the bloxes, and there is need in globally switching control to new bloxes. This can be accomplished with the use of a special global versioner blox. It can keep track of the different versions of all bloxes in a given DIAL configuration, and switch between configurations. - bad blox isolater Combining many of the previous bloxes, a special system that automatically isolates new versions of bloxes that fail to be backward and/or forward compatible is possible. This is a direct implementation of the software development pipeline. implementation == The DIAL system should be implemented graphically and visually. A corresponding language specification to describe a DIAL network is possible and desirable, but a visual or graphical interface to any operation should always be possible; likewise a representation of all DIAL states should be supported. Ideally, the DIAL universe could be visualized in a 3d virtual reality, and a person could physically "grasp" and manipulate all the basic objects (flits, floutes, bloxes). The one-to-one visual representation of DIAL and its states is a key aspect of its accessability and usefulness. There should be in principle no memory limits on the core DIAL entities: flits, floutes, or bloxes. Limits in implementations such as "a maximum of 50 queued flits per floute is supported" are antithetical to the basic design principles. However there may be various memory limits associated with particular uses of the entities that model a particular application. The DIAL system internally must have many protections that maintain its internal consistency at all times to prevent corruption of its state. However, unreliability of all components it models is allowed (and specifically designed for). The states of components are allowed to be inconsistent to some degree, such that a state like "blox went down" might arise at a random time. Typically in implementations, the "last known state" of a blox would be tracked, along with protocols to retrieve the most recent state and allow resetting or other manipulations of the state. Recent new technologies such as the Web and Java may be excellent environments for implementing aspects of the DIAL specification. Recent widespread interest and developments in intranets, workflow analysis, and reengineering may be very tangibly furthered through the introduction and use of DIAL systems. examples == 1. a DIAL system can be used to model the workflow of a company. Individual flits can be thought of as documents, and bloxes are the operations that transform the documents. The history mechanisms are identical to revision control. Floutes represent interactions or communication paths between people or departments. This is the primary "information assembly line" application of the DIAL concept. In this system, the "paperwork" cannot be lost because of the inherent properties of DIAL. In fact it is in these sitations that history, tracing, and "timeout" mechanisms are most valuable. The estimation feature gives an approximation of "how long will this take when submitted". 2. a DIAL system could reflect the internal state of packets on a network. Individual computers, routers, etc. are seen as bloxes, and messages are the flits, and floutes are the network routes. The DIAL tracing features are especially useful in this context. 3. a DIAL system could represent a large, complex software project. Individual bloxes are the components in the program. The versioning capabilities deal with the development pipeline. Regression testing and the process of moving to new versions is explicitly built into the system. 4. the DIAL system might represent data being sent over the World Wide Web. 5. the DIAL system could represent a distributed computing system in which the bloxes are spread out over the Internet (the floutes), and socket communication is used to transport flits. 6. In many debugging situations, "bad data" is detected without any knowledge of its history and complex measures are employed to try to deduce the prior processes that led to it. The history mechanism in DIAL allows a user to trace prior states of the flit and find the exact point or blox where the corruption occured. Also, the capability of putting a "rider" on a flit that detects when it is modified in some way is possible. 7. There is a natural correspondence between every computer language and the DIAL system. Subroutines or arithmetic operations are like bloxes, and parameters are passed in floutes. Variables are the contents of specific "floutes" at various points in processing.
participants (1)
-
Vladimir Z. Nuri