A New Approach to Managing Complexity:
Analysis of Relevance Based on the Tolerability of Goal-Adverse Factors

Have you ever wondered whether a workgroup is successful because Fred is its leader or whether it is successful although he is the leader? Have you ever wondered whether a product is successful because red is its color or whether it is successful although its color is red?

Whenever several factors of different importance act on a common goal, the magnitude of each factor's influence is often far from obvious. The analytical procedure presented here yields an evaluation of the importance of all recorded factors plus a classification of the alternatives given as input.

  · Ideal variants are discovered. These variants are characterized solely by goal-promoting attributes. If one of the ideal attributes is missing by accident, the result is most likely still acceptable: failing to perform in one or more less important respects is tolerated.
  · Alternatives are highlighted whose success could not be expected from a comparison with all the other acceptable alternatives. These alternatives should be examined further, because factors unknown so far could be missing from the analysis.
  · Important factors are separated from less important factors.

This procedure assists in analysing complex input-output systems that are characterized by two levels of output strength: the output is either acceptable or unacceptable. The object analysed can be, for example, a workgroup whose team performance is either acceptable or unacceptable. Another example is a technical system that either works or fails. Likewise, different designs of a product can be acceptable or unacceptable. The point is: assigning intermediate levels of performance makes no sense here.

The procedure is new and not based on any known statistical method; it also differs from neural networks. It is an artificial-intelligence approach and, within that field, a hybrid of symbolic and sub-symbolic representation with an evolutionary learning component. The method is not derived statistically but inferred from evolutionary methods. Among statistical methods it would be classified as a dependency analysis with nominally scaled independent variables and a dichotomously scaled dependent variable. The independent variables are assumed to combine linearly. The procedure highlights specific data sets among all known ones and at the same time calculates values that represent the importance of the independent variables (indirectly, via the tolerability of arbitrary values).

Example of an analytical task

A team shall consist of five positions: a team leader, a substitute, and three specialists with different responsibilities. Everybody cooperates with everybody else, so that a complex network of interactions exists.
Each role within the group can be filled by different persons. Not everybody is adequate for every role. Knowledge, experience, and abilities, as well as soft factors, determine group performance. Possible weaknesses within the group are tolerated as long as these weaknesses are not important enough to make the group as a whole fail.
A planner needs to know who is the best person for each role within the workgroup. Although every combination of persons that performs acceptably is equally acceptable, there are differences: if a group member unexpectedly has to be replaced by somebody else, the others should be ideal in the sense of interacting fruitfully with the new member. In this sense, the ideal person shall be appointed to each role. Among a multitude of combinations with acceptable results, some combinations are preferable.
It is advantageous to know which position within the group is the most important for group performance. A role that has critical influence on group performance cannot be staffed with just anybody: there are few alternatives, and weaknesses are tolerated least.
The task of finding the best team members fulfills the requirements for an analysis by the presented method:

  · all variants can be qualified as either acceptable or unacceptable.
  · the factors are categorical, not numeric, in nature.
  · goal-adverse values in certain factors are tolerated to a certain extent.

The following teams are expected to show acceptable team performance; all other combinations of team members are expected to deliver unacceptable performance:

          Team leader  Substitute  Specialist A  Specialist B  Specialist C
Team 1    Fred         Barbara     Mike          Walter        Lisa
Team 2    Barbara      Tom         Mike          Walter        Lisa
Team 3    Fred         Tom         Mike          Walter        Boris
Team 4    Fred         Barbara     Mike          Curt          Lisa
Team 5    Fred         Tom         Mike          Curt          Mandy
Team 6    Fred         Tom         Mike          Curt          Lisa
Team 7    Fred         Barbara     Mike          Curt          Boris
Team 8    Fred         Tom         Mike          Walter        Mandy
Team 9    Fred         Tom         David         Walter        Mandy
Team 10   Barbara      Tom         Mike          Walter        Mandy
Team 11   Fred         Barbara     Mike          Walter        Mandy
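For concreteness, the example base above can be held in a plain data structure. The dictionary layout below is an illustrative assumption, not the format used by the software, and the closing query (how many distinct persons fill each role across acceptable teams) is likewise only an illustration, not part of the procedure:

```python
ROLES = ["Team leader", "Substitute", "Specialist A", "Specialist B", "Specialist C"]

# The eleven acceptable teams from the table above
ACCEPTABLE_TEAMS = {
    "Team 1":  ("Fred",    "Barbara", "Mike",  "Walter", "Lisa"),
    "Team 2":  ("Barbara", "Tom",     "Mike",  "Walter", "Lisa"),
    "Team 3":  ("Fred",    "Tom",     "Mike",  "Walter", "Boris"),
    "Team 4":  ("Fred",    "Barbara", "Mike",  "Curt",   "Lisa"),
    "Team 5":  ("Fred",    "Tom",     "Mike",  "Curt",   "Mandy"),
    "Team 6":  ("Fred",    "Tom",     "Mike",  "Curt",   "Lisa"),
    "Team 7":  ("Fred",    "Barbara", "Mike",  "Curt",   "Boris"),
    "Team 8":  ("Fred",    "Tom",     "Mike",  "Walter", "Mandy"),
    "Team 9":  ("Fred",    "Tom",     "David", "Walter", "Mandy"),
    "Team 10": ("Barbara", "Tom",     "Mike",  "Walter", "Mandy"),
    "Team 11": ("Fred",    "Barbara", "Mike",  "Walter", "Mandy"),
}

# How many distinct persons fill each role across the acceptable teams
variety = {role: len({team[i] for team in ACCEPTABLE_TEAMS.values()})
           for i, role in enumerate(ROLES)}
print(variety)
```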

If, in a team with acceptable team performance, a role can be filled by more than one person, two options are possible: either this position is critically important for group performance, in which case each of the persons filling this role in successful teams can be considered an ideal choice; or this position is less than critically important, in which case the position can be staffed non-ideally and group performance nevertheless remains acceptable.
The more ideal staffings exist and the less important the positions are for group performance, the more combinations of different persons can be observed to be successful. A non-ideal person in a position endangers the success of the whole group: group performance may turn unacceptable, and the more important that person's position is, the more likely group failure becomes. The influence of several non-ideally filled positions adds up to a value that determines whether group performance as a whole stays above the threshold of acceptability.
The ideal person for a position and the importance of a position for group performance cannot be read off directly; they can only be inferred from comparisons of different successful combinations of team members. Only one statement is certain: in all teams with acceptable performance, the positions staffed with non-ideal team members are not important enough to make these teams fail.

The procedure supports analysis of factors acting directly on the success criterion. This means inhibiting factors cannot, or can only partially, be compensated by opposing promoting factors. The application scenario of the method covers cases where each factor can inhibit the success of an alternative and there is no way to compensate for goal-adverse values in critical factors.
If we supposed that some factors could compensate for others, that would imply that factor variations lead to monotonic output variations. If, on the contrary, there were threshold values, we could not counterbalance the effect of variations of antagonistic factors. The option of compensation implies that no factor is critically important, because even intolerable goal-adverse values could be compensated. If we assume compensation of effects between factors, the use of the procedure is inadequate.
On the other hand, if we assume that at least some of the factors can be critically important, the method supports the analysis.
There are different kinds of factors. Some are best described using numbers, e.g. number of clients, turnover, etc. When you measure the influence of these factors, it is easy to say what influence derives from variations of the factor input.
  A car manufacturer compares different models, each with a top speed of X and a fuel consumption of Y.
There are factors which cannot be described by numbers.
  A radio station plays music of different styles. Each style contributes to the radio station's success in a different way.
Each style is a category which can be compared and identified as equal or different. But it is difficult to establish a "distance" between styles, as is possible with the different top speeds of different cars. As a surrogate, categorical factors are often transformed into numeric factors. The most prominent metric is money:
  A company needs to hire a computer specialist. One candidate costs 60,000 a year, another expects 100,000 a year. A third expects 80,000 a year. Following this surrogate metric, the third candidate's overall contribution to company success should lie between those of the first two candidates.
The procedure is aimed at categorical factors. With metric factors, different alternatives rarely show the same values, so the presented procedure would run into problems: it is based on comparisons, on stating identity or difference, and if almost no values were identical, the analysis would yield no results. A solution could be to establish classes of values within a numeric factor. Classes can be compared like categories, as equal or different across variants. We should keep in mind that transferring metric factors into classes means losing information: slight differences between alternatives assigned to one class are ignored. Consequently, other instruments of analysis should be preferred when metric factors are evaluated.
Ignoring the "distance" between different values is a drawback of the procedure. At the same time, it is an advantage: other analytical procedures implicitly equate small variations of factor input with small variations in output. That holds true in many cases, but not always. It can be unjustified,
  e.g. if a threshold price is reached and a small increase in price makes sales figures break down disproportionately.
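A minimal sketch of the class-building workaround for metric factors; the class boundaries and labels are assumptions chosen for the hiring example above, not values prescribed by the method:

```python
def to_class(value, boundaries, labels):
    """Map a numeric value to a categorical class label.

    boundaries: sorted upper bounds of all classes but the last;
    labels: one label per class (len(boundaries) + 1 labels).
    """
    for bound, label in zip(boundaries, labels):
        if value < bound:
            return label
    return labels[-1]

# Illustrative salary classes for the computer-specialist example
boundaries = [70_000, 90_000]
labels = ["low", "medium", "high"]

print(to_class(60_000, boundaries, labels))   # low
print(to_class(80_000, boundaries, labels))   # medium
print(to_class(100_000, boundaries, labels))  # high
```

Once binned like this, the salaries can be compared as equal or different, at the price of ignoring differences within one class.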
Software based on the procedure supports data collection and evaluates the data provided by the analyst. The analyst adds new examples to the example base and states whether these examples perform acceptably or unacceptably. Thereafter, the computer program begins its calculations.
The analyst contributes new factors and adds the respective values for all examples. It is the analyst's responsibility to ensure the completeness and validity of the data. The software records the data and finds the relevance values. All further procedure steps necessary to calculate the relevance information are performed autonomously by the program; the analyst is not required to give further directions and can concentrate on providing data input. From all examples the software identifies those variants that come closest to the ideal variants. Additionally, those examples are found that differ in important factors from all other examples. The software consists of three components:

  1. A first module manages the input of new examples and revisions of known examples. This module automatically checks the consistency of the input and shall be intuitively usable. The user interface guides the analyst through the analysis. The analyst determines which factors shall be included in the procedure.
  2. A second module generates relevance hypotheses from comparisons of the examples.
  3. A third module combines both preceding modules. The plausibility of all available examples and of the generated relevance hypotheses is checked. As a result, those variants are marked that differ extremely from all other variants. Additionally, the examples are marked that seem to show ideal values.

Generating the example base is comparable to building a large table.
In a first step the user enters the names of the elements of the analysis. If the example base were a table, each factor name would head a table column. One column contains the dependent variable: the success of the variant.
Row by row, the description of the examples follows. The first column contains the names of the examples. Each of the following columns contains a value for every factor; the table is filled cell by cell. Entering these values starts with clicking the menu item "Combine factors". On top of the selection box the buttons "Insert" and "Delete" appear. These are the tools for combining new examples.
By clicking the button "Current example" the selection box shows the names of all examples according to the currently active filter. A mouse click selects an example and thereby the row of the table that is modified next.
The label below each factor button shows the values already assigned to this example. Cells of the table that still have no value are represented by a question mark. To fill those cells with a value, the analyst clicks on the factor button. Automatically, the selection box shows all values assigned to this factor. With a mouse click, a value can be selected from the selection box and thereby assigned to this example. The program then shows this value in the label below the factor button instead of the question mark.
The analyst repeats this step for each factor marked by a question mark. To modify previous input, a value can be changed simply by clicking a factor and then selecting another value. When a row is completed and each factor is associated with a value, the user can save the input by clicking the button "Insert". This can be done even when a row is not filled completely: the analyst can save any preliminary input and then switch over to another example.
The program then examines the new example by comparing it with all previously entered variants: whether this example was entered before and counted as acceptable before but not now, or the other way round; whether a formerly hidden example now counts as active, or the reverse; whether an example with this name was entered before with different values, or with the same values under a different name. If there are inconsistencies, the user is informed.
By clicking "Delete", whole rows of the table can be deleted very easily. Editing the table is completed when each active example is associated with a value in each active factor. While combining the elements, the analyst can return to step 1 and enter new example names, new factors, and new values. The input procedure is open to additional elements and thereby accounts for new information in a partially understood domain.
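The consistency checks performed when a new example is saved can be sketched roughly as follows. The data layout, function name, and warning texts are assumptions for illustration, not the actual software:

```python
def check_consistency(base, name, values, acceptable):
    """Compare a new example against the existing example base.

    base: dict mapping example name -> (tuple of values, acceptable flag)
    Returns a list of human-readable warnings; an empty list means the
    new example is consistent with everything entered so far.
    """
    warnings = []
    if name in base:
        old_values, old_flag = base[name]
        if old_values != tuple(values):
            warnings.append(f"'{name}' was entered before with different values")
        elif old_flag != acceptable:
            warnings.append(f"'{name}' was counted as "
                            f"{'acceptable' if old_flag else 'unacceptable'} before")
    for other, (other_values, _) in base.items():
        if other != name and other_values == tuple(values):
            warnings.append(f"same values already entered under the name '{other}'")
    return warnings

base = {"Team 1": (("Fred", "Barbara", "Mike", "Walter", "Lisa"), True)}
print(check_consistency(base, "Team X",
                        ["Fred", "Barbara", "Mike", "Walter", "Lisa"], True))
```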
The core of the procedure starts with randomly selecting one of the successful variants. This example, called the candidate, is now compared with all other examples; all other examples are the reference examples.
Why should unacceptable variants not be taken as candidates? The search for importance values only makes sense if the factors are considered important. If they were unimportant, their registration and evaluation, and even more their manipulation, would be pointless. The hypothesis of important factors therefore leads to the assumption that many more combinations of values are unacceptable and that only a few combinations show acceptable results. Random combinations of values most likely yield unacceptable results. This majority of unacceptable variants forms a weak basis for expectations about the relevance of factors and allows only supporting inferences.
The evaluation of an acceptable example (as the candidate) ends with generating the intermediate result vector. Afterwards, the procedure switches to the next acceptable example that has not yet served as candidate. Successively, all acceptable variants serve as candidate example, with the rest of the examples as reference examples. Every candidate differs from the ideal in a different way and therefore yields different information on the relevance of the factors. The counter matrix is initialized again, and adaption cycles begin until the intermediate result vector belonging to each candidate is found.
For each candidate example, the procedure generates a well-founded tolerance vector after many adaption cycles (well-foundedness is a property described in the last paragraph of this page). Each well-founded tolerance vector is transformed into one intermediate result vector. The intermediate result vector represents properties of the relationship between the candidate example and the reference examples.
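The outer loop just described (each acceptable variant serving once as candidate, the results averaged afterwards) can be sketched like this. The callable `intermediate_result` stands in for the adaption cycles and is an assumption, not the actual implementation:

```python
def relevance_vector(examples, intermediate_result):
    """Run every acceptable variant once as candidate and average
    the resulting intermediate result vectors column-wise.

    examples: dict mapping name -> (values, acceptable flag)
    intermediate_result: callable(candidate_values, references)
        returning one intermediate result vector per candidate;
        it stands in for the adaption cycles described further below.
    """
    acceptable = {n: v for n, (v, ok) in examples.items() if ok}
    vectors = []
    for name, values in acceptable.items():
        # all other acceptable examples serve as references
        references = {n: v for n, v in acceptable.items() if n != name}
        vectors.append(intermediate_result(values, references))
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

# Tiny demo with two acceptable variants and a dummy stand-in
demo = {"A": (("x",), True), "B": (("y",), True), "C": (("z",), False)}
result = relevance_vector(demo, lambda v, refs: [1.0] if v == ("x",) else [0.0])
print(result)  # [0.5]
```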

           Team leader  Substitute  Specialist A  Specialist B  Specialist C
Team 1     .6           .3          1.0           .0            .0
Team 2     .3           .5          1.0           .5            .0
Team 3     .5           .4          .6            .4            .0
Team 4     1.0          .0          1.0           .0            .0
Team 5     .6           .5          .8            .0            .0
Team 6     .6           .0          .9            .2            .0
Team 7     1.0          .3          1.0           .3            .0
Team 8     .0 (Fred)    .0 (Tom)    .0 (Mike)     .0 (Walter)   .0 (Mandy)
Team 9     .7           .8          .0            .6            .4
Team 10    .0           .7          .6            .7            .0
Team 11    .8           .0          .8            .6            .0
Relevance  .6           .3          .7            .3            .0

From the intermediate results of the procedure, team 8 can be identified as ideal. From the case-specific relevance values, the software generates average values that describe the importance of the different positions for group performance. The most important position for success is Specialist A; the least important is Specialist C.

All intermediate result vectors are fused into one single relevance vector. The relevance of a factor is understood as the average of all its intermediate results; the relevance vector is the combination of all relevance values. The more similar the variants are, the lower the relevance. Strongly differing variants make the factors look less tolerable, which leads to high relevance values.
Each example base leads to exactly one relevance vector, which shows the importance of the factors. From an example base, as many intermediate result vectors can be derived as the example base contains successful examples. This follows from using each acceptable variant as candidate example and generating its intermediate result vector by comparing it with all the other acceptable examples. From the average of all intermediate results, the relevance vector is created.
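With the eleven intermediate result vectors from the team example above, the fusion step is plain column-wise averaging (a sketch, rounding to one decimal as in the table):

```python
# Intermediate result vectors from the team table above
intermediate = [
    [.6, .3, 1.0, .0, .0],   # Team 1
    [.3, .5, 1.0, .5, .0],   # Team 2
    [.5, .4, .6, .4, .0],    # Team 3
    [1.0, .0, 1.0, .0, .0],  # Team 4
    [.6, .5, .8, .0, .0],    # Team 5
    [.6, .0, .9, .2, .0],    # Team 6
    [1.0, .3, 1.0, .3, .0],  # Team 7
    [.0, .0, .0, .0, .0],    # Team 8 (the ideal variant)
    [.7, .8, .0, .6, .4],    # Team 9
    [.0, .7, .6, .7, .0],    # Team 10
    [.8, .0, .8, .6, .0],    # Team 11
]
n = len(intermediate)
# Average each column (factor) over all candidate examples
relevance = [round(sum(col) / n, 1) for col in zip(*intermediate)]
print(relevance)  # [0.6, 0.3, 0.7, 0.3, 0.0]
```

The result reproduces the relevance row of the table: Specialist A is most important, Specialist C least.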

The intermediate result vector is derived by testing the tolerability of presumably non-ideal values. These tolerance hypotheses are initially established randomly and later more and more on the basis of previously confirmed tolerability hypotheses. A matrix of counter values serves as a memory for earlier adaption cycles. This counter matrix adjusts itself over repeated comparisons and eventually shows a stable distribution of counter values. Mature counter matrices show two intervals in each factor:
One interval is represented only by well-founded counter values: tolerance options up to a certain option are confirmed by high counter values. In the other interval, only counter values below a threshold value are registered. The information on the proportion of each counter value in the sum of each dimension is condensed into specifying the border between those intervals.
The evaluation of a candidate example yields an intermediate result in each dimension. It is determined by the largest well-founded tolerance option in that dimension, but it is not that tolerance option itself. The combination of all intermediate results is an intermediate result vector; to each candidate example belongs one. This intermediate result compresses the information from a counter matrix by neglecting differences between the counter values and reducing them to the attribute of well-foundedness.
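The condensation of one counter-matrix column into its border can be sketched as follows; the counter values, the step width of 0.1, and the well-foundedness threshold are illustrative assumptions:

```python
def highest_well_founded(counters, threshold, step=0.1):
    """Return the largest tolerance option whose counter value is
    well-founded, i.e. at or above the threshold.

    counters: one counter value per tolerance option 0.0, 0.1, 0.2, ...
    Returns None if no option is well-founded.
    """
    best = None
    for i, count in enumerate(counters):
        if count >= threshold:
            best = round(i * step, 1)
    return best

# One factor's counter column for options 0.0 .. 0.9 (assumed numbers):
# high counters up to 0.8, only a tiny counter at 0.9
counters = [40, 38, 35, 31, 30, 27, 22, 18, 12, 1]
print(highest_well_founded(counters, threshold=10))  # 0.8
```

This matches the worked example further below, where tolerance options up to 0.8 are well-founded and 0.9 is not.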
A low tolerance indicates that a dimension is important from the perspective of the current candidate example. To express high importance by high values, the tolerance values are transformed into their complementary values without losing any information.
The intermediate result of a factor is defined by

Intermediate result = 1 - (highest well-founded tolerance value + 0.1) = 1 - lowest not well-founded tolerance option
The intermediate result vector consists of the intermediate results. To each candidate example belongs one intermediate result vector.

  A factor is represented by large counter values from 0.0 up to 0.8. This range of tolerance options is well-founded. The value 0.9 is represented only by a very small counter value, which means this tolerance option does not represent the tolerability of non-ideal values in this dimension.
  To the well-founded tolerance vector of
  { 0.2; 0.6; 0.9; 0.9; 0.7; }
  belongs the intermediate vector
  { 0.7; 0.3; 0.0; 0.0; 0.2; }
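The transformation from tolerance vector to intermediate result vector is then a one-liner per factor, following the formula above (a sketch; the assumed step width between tolerance options is 0.1):

```python
def intermediate_vector(tolerances):
    """Complement each well-founded tolerance value per the formula
    intermediate result = 1 - (highest well-founded tolerance + 0.1)."""
    return [round(1.0 - (t + 0.1), 1) for t in tolerances]

print(intermediate_vector([0.2, 0.6, 0.9, 0.9, 0.7]))
# [0.7, 0.3, 0.0, 0.0, 0.2]
```

This reproduces the worked example above: the well-founded tolerance vector { 0.2; 0.6; 0.9; 0.9; 0.7; } yields the intermediate result vector { 0.7; 0.3; 0.0; 0.0; 0.2; }.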

As soon as the counter matrix shows a mature distribution, the intermediate result vector can be read from it and is saved as the current intermediate result vector. Processing of the current candidate example could now be terminated; to verify the result, however, it is advantageous to enter the adaption cycles again and generate a new intermediate result vector. If both vectors are identical, the intermediate result can be considered reliable. Otherwise, a renewed intermediate result vector should be generated and compared with the previous result. When the results are identical or differ only slightly, the loop can be left. The environment around values that counts as similar enough is set at the beginning of the procedure. When consecutive intermediate results coincide, this procedure step is finished. The intermediate result is saved; it represents the relevance as suggested by the example base from the perspective of the current candidate example. With that, the first partial result is gained.
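This verification loop can be sketched like this; `generate_intermediate` stands in for one full run of adaption cycles, and the tolerance `eps` is the "similar enough" environment set at the start of the procedure (both are assumptions, not the actual implementation):

```python
def verified_intermediate(generate_intermediate, eps=0.05, max_runs=20):
    """Repeat the adaption cycles until two consecutive intermediate
    result vectors agree within eps in every dimension."""
    previous = generate_intermediate()
    for _ in range(max_runs):
        current = generate_intermediate()
        if all(abs(a - b) <= eps for a, b in zip(previous, current)):
            return current          # consecutive results coincide
        previous = current
    return previous                 # give up after max_runs attempts

# Demo with a deterministic stand-in for the adaption cycles
result = verified_intermediate(lambda: [0.7, 0.3, 0.0])
print(result)  # [0.7, 0.3, 0.0]
```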