# Workflow Exploration

Table execution is useful when you can only select discrete variable values; for example, plate thickness may be available only in 0.5 mm, 1 mm, 1.5 mm, 3 mm and 6 mm. You can also choose between shapes, e.g. a rod, a tube or a hollow rectangular cross-section. It is straightforward to define all combinations of your input variables and generate a table of cases to execute.
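As a minimal sketch of how such a table could be generated, the snippet below builds every combination of two discrete inputs. The variable names are assumptions made for illustration; the values are the ones mentioned in the text.

```python
from itertools import product

# Discrete design inputs (values from the text; the names are illustrative).
thickness_mm = [0.5, 1.0, 1.5, 3.0, 6.0]
shape = ["rod", "tube", "hollow_rectangle"]

# Every combination of the inputs becomes one row (case) in the table.
cases = [
    {"thickness_mm": t, "shape": s}
    for t, s in product(thickness_mm, shape)
]

print(len(cases))  # 5 thicknesses x 3 shapes = 15 cases
for case in cases[:3]:
    print(case)
```

Each added input multiplies the number of rows, so the table grows quickly with the number of inputs and levels.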

The oldest form of optimisation is to select the 'best case' from this table execution and to treat that case as the optimum design.
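Selecting the best case can be sketched as follows. The `evaluate` function is a hypothetical stand-in for a real simulation, and the deflection limit and input values are assumptions chosen only to make the example concrete.

```python
from itertools import product

def evaluate(thickness_mm, width_mm):
    """Hypothetical stand-in for a simulation: returns (mass, deflection)."""
    mass = thickness_mm * width_mm                          # ~ cross-section area
    deflection = 1000.0 / (width_mm * thickness_mm ** 3)    # stiffer when thicker
    return mass, deflection

thicknesses = [0.5, 1.0, 1.5, 3.0, 6.0]
widths = [10.0, 20.0, 40.0]
max_deflection = 5.0  # assumed design requirement

# Execute the full table of cases.
table = []
for t, w in product(thicknesses, widths):
    mass, deflection = evaluate(t, w)
    table.append({"t": t, "w": w, "mass": mass, "deflection": deflection})

# The 'best case' is the lightest design that meets the requirement.
feasible = [row for row in table if row["deflection"] <= max_deflection]
best = min(feasible, key=lambda row: row["mass"])
print(best)
```

With these made-up numbers the lightest feasible case is the 3 mm thickness at 10 mm width.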

As explained above, each input has a maximum and a minimum value. For a single variable, this can be visualised as a line segment. For three design variables, the input max/min values form a box (a cube once the inputs are normalised).

The space spanned by the design inputs is therefore referred to as the design space, and for N inputs, we span an N-dimensional design space. Most product design spaces are very difficult for humans to visualise, as they tend to involve a rather large number of design inputs.

A more general phrase for investigating ways to improve your product is Design Space Exploration.

When variables are continuous, i.e. not discrete, the number of table executions can be reduced by taking small steps in each input variable and observing the change this causes in the design outputs, i.e. by looking at the (output/input) design sensitivities. It is more effective to follow a path along which the input variables show high sensitivity than one where the sensitivity is low.
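The sensitivities described above can be estimated with a forward finite difference, as in the sketch below. The `output` function is a made-up example, and the step-size rule is an assumption rather than a recommendation.

```python
def sensitivities(f, x, rel_step=1e-6):
    """Forward-difference estimate of d(output)/d(input) for each input."""
    y0 = f(x)
    grads = []
    for i, xi in enumerate(x):
        h = rel_step * max(abs(xi), 1.0)   # step scaled to the input size
        x_step = list(x)
        x_step[i] = xi + h
        grads.append((f(x_step) - y0) / h)
    return grads

def output(x):
    # Hypothetical design output: very sensitive to x1, barely to x2.
    x1, x2 = x
    return 10.0 * x1 ** 2 + 0.1 * x2

grads = sensitivities(output, [1.0, 1.0])
print(grads)  # roughly [20.0, 0.1]
```

Here a step in x1 changes the output about 200 times more than the same step in x2, so x1 is the more rewarding direction to follow.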

Optimisers of this type are called gradient optimisers. Many FE solvers have such optimisers built in, and they can be very effective because part of the optimisation work can be solved analytically inside the FE code.

Gradient-type optimisers work very well for problems where the design space is straightforward, e.g. smooth with a single minimum.

A limitation of gradient optimisers is that they stop at the first minimum they find. It is a bit like playing golf blindfolded from an arbitrary starting position and stopping at the first depression the ball lands in, because that is your best information on having found the hole. Therefore, to find the optimum design, many starting points must be tested, i.e. several optimisation runs must be made.
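The effect of the starting point can be illustrated with a plain gradient descent on a one-dimensional function that has two minima. The function, step size and starting points below are assumptions chosen for illustration.

```python
def f(x):
    # Hypothetical design output with a local and a global minimum.
    return x ** 4 - 3.0 * x ** 2 + x

def df(x):
    # Analytical gradient of f.
    return 4.0 * x ** 3 - 6.0 * x + 1.0

def gradient_descent(x, step=0.01, iterations=2000):
    """Follow the negative gradient from x until the iterations run out."""
    for _ in range(iterations):
        x -= step * df(x)
    return x

starts = [-2.0, 0.5, 2.0]
minima = [gradient_descent(x0) for x0 in starts]
best = min(minima, key=f)

for x0, xm in zip(starts, minima):
    print(f"start {x0:+.1f} -> minimum at x = {xm:+.4f}, f = {f(xm):+.4f}")
```

Two of the three runs stop in the shallower minimum near x = 1.13; only the run started at -2.0 reaches the deeper one near x = -1.30.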

A further disadvantage of gradient-based optimisers is that data from one optimisation run cannot be reused in the next run. In other words, gradient-based optimisation runs can be started in parallel, one after the other or in a combination thereof, but each run executes on its own and learns nothing from the others.

## Evolutionary Algorithms

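There exist more expensive optimisation methods that search the design space globally and are far less likely to become trapped in a local minimum, although in practice they cannot strictly guarantee that the global optimum is found. These methods include Genetic Algorithms (GA) and Simulated Annealing.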

GA is inspired by the process of natural selection (Link). The GA recombines and mutates the design inputs and studies the resulting outputs. Depending on how we have defined the desirable design outputs (the optimum), the GA lets the better-performing combinations pass on to the next generation.
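A very small GA can be sketched as below. This is an illustrative toy, not the implementation of any particular tool: the objective, population size, mutation width and selection rule are all assumptions.

```python
import random

rng = random.Random(0)                  # seeded for repeatability
BOUNDS = [(-3.0, 3.0), (-3.0, 3.0)]     # assumed min/max of two inputs

def objective(x):
    # Hypothetical multi-modal design output to be minimised.
    x1, x2 = x
    return (x1 ** 2 - 3.0) ** 2 + (x2 - 1.0) ** 2 + x1

def random_design():
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]

def crossover(a, b):
    # Each input is inherited from one of the two parents.
    return [rng.choice(pair) for pair in zip(a, b)]

def mutate(x, sigma=0.2):
    # Small random change, clipped to the design space.
    return [min(hi, max(lo, xi + rng.gauss(0.0, sigma)))
            for xi, (lo, hi) in zip(x, BOUNDS)]

population = [random_design() for _ in range(30)]
history = []                            # best output per generation
for generation in range(20):
    population.sort(key=objective)
    history.append(objective(population[0]))
    parents = population[:10]           # selection: the best third survives
    children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                for _ in range(20)]
    population = parents + children     # keeping the parents is elitism

best = min(population, key=objective)
print(best, objective(best))
```

Because the parents are carried over unchanged, the best output can never get worse from one generation to the next.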

## Random Optimisation

Random Optimisation is used for cases where functions are not continuous and when inputs are uncertain.

It is a direct search method, so neither gradients nor response surface models (RSM) are used. Candidate inputs are drawn from a normal distribution as the design space is explored.
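A minimal sketch of this idea is shown below: candidates are drawn from a normal distribution centred on the best design found so far, and a candidate is kept only if it improves the output. The stepped objective, the standard deviation and the iteration count are assumptions for illustration.

```python
import math
import random

def objective(x):
    # Hypothetical stepped (discontinuous) output: gradients are zero or undefined.
    return math.floor(abs(x[0] - 2.0) * 4) + math.floor(abs(x[1] + 1.0) * 4)

def random_optimisation(f, x0, sigma=0.5, iterations=2000, seed=0):
    """Direct search: sample around the best point, keep improvements only."""
    rng = random.Random(seed)
    best_x, best_y = list(x0), f(x0)
    for _ in range(iterations):
        candidate = [xi + rng.gauss(0.0, sigma) for xi in best_x]
        y = f(candidate)
        if y < best_y:
            best_x, best_y = candidate, y
    return best_x, best_y

x_start = [-3.0, 3.0]
x_best, y_best = random_optimisation(objective, x_start)
print(objective(x_start), "->", y_best, "at", x_best)
```

Every iteration costs one evaluation of the design output, which is why the approach becomes expensive when each evaluation is a full simulation.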

The approach is computationally costly.

It is useful to investigate the design space from the best vantage points, to reuse information between experiments and to execute runs/experiments in parallel. Design of Experiments theory makes this possible.

Design of Experiments (DoE) theory is well established and originates from the planning of controlled physical experiments. There are many variations of DoE. The version discussed here uses a Response Surface Model (RSM), also known as a surrogate model or metamodel.

We know that any sufficiently smooth (differentiable) function can be approximated locally using a Taylor series (Link). A Maclaurin series, i.e. a Taylor series expanded about zero, can also be identified for particular functions.

For an unknown function, we can approximate the Maclaurin series using a polynomial. Imagine two inputs, x1 and x2, and a single design output y that we want to approximate using a limited number of experiments. We can approximate this function as y = A + B*x1 + C*x2 + D*x1^2 + E*x2^2 + F*x1*x2. The unknowns are the six coefficients A - F.

This implies that we need a minimum of six experiments to determine the coefficients A - F. Six experiments are sufficient only when there is zero variability in the outputs, as is the case when running a deterministic simulation model for a linear problem and for many non-linear problems. In real-life experiments, we would need more than six experiments in order to estimate the variability.

Having determined the values of the coefficients A - F, we are able to approximate the design space using the Response Surface Model (RSM) y(x1,x2) shown above. The advantage is that executing an RSM is computationally inexpensive.
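The determination of the six coefficients from six experiments can be sketched as a linear system, one equation per experiment. The `simulate` function, its coefficients and the six experiment points are assumptions for illustration; a small Gaussian elimination is written out so that the example has no external dependencies.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]    # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):                       # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def basis(x1, x2):
    # Terms of y = A + B*x1 + C*x2 + D*x1^2 + E*x2^2 + F*x1*x2
    return [1.0, x1, x2, x1 ** 2, x2 ** 2, x1 * x2]

def simulate(x1, x2):
    # Hypothetical deterministic "simulation" with made-up coefficients.
    return 1.0 + 2.0 * x1 - 3.0 * x2 + 0.5 * x1 ** 2 + 4.0 * x2 ** 2 - 1.5 * x1 * x2

# Six experiments (assumed points, chosen so the system is solvable).
experiments = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0),
               (0.0, 1.0), (0.0, -1.0), (1.0, 1.0)]
design_matrix = [basis(x1, x2) for x1, x2 in experiments]
outputs = [simulate(x1, x2) for x1, x2 in experiments]
coefficients = solve(design_matrix, outputs)           # A, B, C, D, E, F

def rsm(x1, x2):
    """Cheap surrogate: evaluate the fitted polynomial."""
    return sum(c * t for c, t in zip(coefficients, basis(x1, x2)))

print([round(c, 6) for c in coefficients])
print(rsm(0.3, -0.7), simulate(0.3, -0.7))
```

Since the made-up simulation here is itself a quadratic, the RSM reproduces it to rounding error; for a real model the RSM would only be an approximation. Not every set of six points works: they must be chosen so that the six equations are independent.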

Note that:

• The six experiments required to form the RSM may take considerable time to execute.
• The six experiments can be executed independently from each other.

OK - so which values should we use for x1 and x2? A maximum, mean, minimum, or are some other values a better choice?

This is where DoE methods differ. They also differ in how complex an RSM they are able to generate. In short, the more you know about your problem, the more precisely you can select the DoE method and the RSM. The more precisely a DoE and RSM are tailored to one problem, the less generally they apply to a new problem. A discussion on this subject is found here (Link).

What is best? The definition of best depends entirely on the problem you need to solve and your information on the problem to solve at the time you define 'what is best'. If you have analysed similar types of problems before, you may know that a certain polynomial works better or requires fewer experiments to determine the RSM. You may even have been able to identify that your RSM polynomial is a certain kind of Maclaurin expansion that approximates a certain kind of function and can draw on this information for reduced effort and improved RSM precision.

We limit the discussion here to two kinds of models for general problems where nothing is known about the problem a priori: Taguchi and Three Level Full Factorial (3LFF).

Both models use the design input’s max/min values. The Taguchi model has versions that do and do not use mid points, while the 3LFF method always uses mid points ((max+min)/2).

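The Taguchi model spans a linear RSM, i.e. y = A + B*x1 + C*x2, while the 3LFF method generates the RSM by y = A + B*x1 + C*x2 + D*x1*x2 + E*x1^2 + F*x2^2 + G*x1*x2^2 + H*x1^2*x2 + I*x1^2*x2^2. With three levels per input, terms up to second order in each input can be resolved, which for two inputs gives the nine coefficients A - I.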

The Taguchi model is a screening model and the 3LFF model is an exploration model. Furthermore, the cost of running the Taguchi model is N+1 experiments, while the 3LFF model costs 3^N experiments.
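The 3^N growth of the 3LFF design can be seen by generating it explicitly, as in the sketch below. The bounds are made-up values, and the N+1 count for the screening model is simply the figure quoted in the text.

```python
from itertools import product

def three_level_full_factorial(bounds):
    """Return all 3^N combinations of (min, mid, max) for N inputs."""
    levels = [(lo, (lo + hi) / 2.0, hi) for lo, hi in bounds]
    return list(product(*levels))

bounds = [(1.0, 3.0), (10.0, 20.0)]    # hypothetical min/max for x1 and x2
design = three_level_full_factorial(bounds)
for point in design:
    print(point)

# Experiment counts as the number of inputs N grows.
for n in range(2, 7):
    print(f"N = {n}: screening {n + 1:>2} runs, 3LFF {3 ** n:>3} runs")
```

For two inputs the design has 9 points, but at six inputs it already requires 729 runs, which is why screening out unimportant inputs first pays off.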

The Taguchi model is therefore used to identify which of the design inputs matter, i.e. to screen the design. After discovering this, you should continue with a more advanced model such as 3LFF.

As you can see in the above example, we may choose to add or remove RSM coefficients when selecting our RSM model. If, say, the D, G and H coefficients are small, we can simply set up a new RSM model without those terms to see if it works better.

There is one more type of RSM that needs to be mentioned: Kriging (Link), which we will explore below.

A polynomial RSM fitted to more points than it has coefficients will generally not pass exactly through the computed points in the design space. A Kriging model will exactly fit the computed points, but may be off in between these points.
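The interpolating property can be demonstrated with a one-dimensional simple-Kriging sketch. The Gaussian covariance function, its length scale, the use of the sample mean as the known mean, and the data points are all assumptions for illustration.

```python
import math

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def covariance(a, b, length=1.0):
    # Gaussian covariance: nearby points are strongly correlated.
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

# Computed points (made-up inputs and outputs).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 2.0, 5.0]

mean = sum(ys) / len(ys)                               # assumed known mean
K = [[covariance(a, b) for b in xs] for a in xs]
weights = solve(K, [y - mean for y in ys])

def krige(x):
    """Predict the output at x from the computed points."""
    return mean + sum(w * covariance(x, xi) for w, xi in zip(weights, xs))

for xi, yi in zip(xs, ys):
    print(xi, yi, round(krige(xi), 9))   # prediction equals the computed value
print(krige(1.5))                        # an estimate in between points
```

At the computed points the prediction reproduces the data to rounding error; between the points the value depends on the chosen covariance function and length scale.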

Kriging is therefore an excellent tool to refine an RSM for use near an optimum by enriching the RSM with a few extra simulation cases near the optimum. This is done at little or no cost, as you must verify the RSM estimated optimum anyway.

George E. P. Box once wrote, "All models are wrong, but some are useful," a remark that appears in his book on RSMs.

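Having found an RSM that is able to describe the design space, we can now afford to run the more expensive global optimisation methods, e.g. Genetic Algorithms (GA) and Simulated Annealing, on the RSM itself, because each RSM evaluation is cheap. As with any such method, the optimum found is an estimate that must be verified with the real model.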
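As a sketch of running a global optimiser on a cheap RSM, the snippet below applies a basic simulated annealing loop to a quadratic response surface. The RSM coefficients, bounds, cooling schedule and step width are assumptions for illustration.

```python
import math
import random

def rsm(x1, x2):
    # Hypothetical fitted response surface (coefficients are made up).
    return 1.0 + 2.0 * x1 - 3.0 * x2 + 0.5 * x1 ** 2 + 4.0 * x2 ** 2 - 1.5 * x1 * x2

def simulated_annealing(f, start, bounds, steps=5000, t0=1.0, seed=0):
    rng = random.Random(seed)
    current, current_y = list(start), f(*start)
    best, best_y = list(current), current_y
    for k in range(steps):
        temperature = t0 * (1.0 - k / steps) + 1e-9    # linear cooling
        candidate = [min(hi, max(lo, xi + rng.gauss(0.0, 0.3)))
                     for xi, (lo, hi) in zip(current, bounds)]
        y = f(*candidate)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if y < current_y or rng.random() < math.exp((current_y - y) / temperature):
            current, current_y = candidate, y
            if y < best_y:
                best, best_y = list(candidate), y
    return best, best_y

bounds = [(-5.0, 5.0), (-5.0, 5.0)]
start = [4.0, 4.0]
best, best_y = simulated_annealing(rsm, start, bounds)
print(best, best_y)
```

For this particular made-up surface the exact minimum is y = -1 at (x1, x2) = (-2, 0), which gives a reference for judging how close the search gets. The 5000 evaluations are affordable only because each one is a polynomial evaluation rather than a simulation.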

Not all problems can be captured using RSMs. The design space may contain several localised minima or simply be uneven, contain forbidden parts, etc.

If this is the case, we may have to use an advanced optimiser such as a GA directly on the problem. However, we should only do so when we know that this truly is required.

The author’s experience is that a GA optimiser typically needs four or five generations to identify the range of an optimum.

The most desirable design, or the optimum, is in some cases very simple to define, e.g. low weight and high stiffness. However, when executing the workflow, we can also add other types of consideration. For example, we can choose between two or more materials that have different prices. A lighter design in the pricey Material A may result in a product that is costlier than a heavier design in the more mundane Material B.
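The material trade-off can be made concrete with a few lines of arithmetic. The masses and prices below are made-up numbers used only to illustrate that the lightest design need not be the cheapest.

```python
# Hypothetical candidate designs: masses and prices are made-up numbers.
candidates = {
    "Material A": {"mass_kg": 2.0, "price_per_kg": 12.0},
    "Material B": {"mass_kg": 3.0, "price_per_kg": 5.0},
}

def material_cost(design):
    return design["mass_kg"] * design["price_per_kg"]

lightest = min(candidates, key=lambda name: candidates[name]["mass_kg"])
cheapest = min(candidates, key=lambda name: material_cost(candidates[name]))

for name, design in candidates.items():
    print(name, design["mass_kg"], "kg, cost", material_cost(design))
print("lightest:", lightest, "| cheapest:", cheapest)
```

With these numbers Material A gives the lightest design (2 kg, cost 24) while Material B gives the cheapest (3 kg, cost 15), so the definition of the optimum decides which one is chosen.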

When multiple design considerations and/or design disciplines are mixed, the workflow execution is referred to as Multi-Disciplinary.

One example of a Multi-Disciplinary optimisation of a train wheel is shown here (Link).