Clustering to determine variables for a linear program

Question

I have a database containing the execution logs of some manufacturing process. A particular machine can perform various operations on products during a shift. My job is to calculate the nominal operation times on this particular machine.

For example, the machine can perform operations A, B, etc. On a given shift, there could be a mixture of these operations happening, such as

10 * A + 5 * B = 480 min

I.e., the machine performs 10 instances of operation A and 5 instances of operation B, without overlap, within an 8-hour shift. I have many such examples, so I was advised to use linear programming to determine how long the operations A, B, etc. take. The shift lengths and the number of occurrences are known in advance.
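To make the setup concrete, here is a minimal sketch of the unseparated problem. It is only an illustration under stated assumptions: the three shifts and their counts are made up (chosen to be consistent with A = 30 min and B = 36 min), and it uses an ordinary least-squares fit via the normal equations rather than an actual LP solver. The function name `least_squares` is hypothetical.

```python
from fractions import Fraction

def least_squares(counts, totals):
    """Fit per-operation times t minimizing ||counts * t - totals||^2.

    Solves the normal equations (C^T C) t = C^T y with exact rational
    arithmetic. Raises StopIteration if the system is singular, i.e. the
    shifts do not determine every operation time.
    """
    n = len(counts[0])
    # Augmented normal-equation system [C^T C | C^T y].
    aug = [
        [sum(Fraction(row[i]) * row[j] for row in counts) for j in range(n)]
        + [sum(Fraction(row[i]) * y for row, y in zip(counts, totals))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination.
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [float(aug[i][n]) for i in range(n)]

# Made-up data: each row is one shift, columns are counts of A and B.
counts = [[10, 5], [4, 10], [16, 0]]
totals = [480, 480, 480]  # shift lengths in minutes

times = least_squares(counts, totals)
print(times)  # [30.0, 36.0]
```

With real, noisy logs the fit would not be exact, and a library routine such as `numpy.linalg.lstsq` would be the more usual choice than hand-written elimination.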

It turns out the operation code (i.e., A, B, etc.) does not correlate well with the actual time it took to process an item. Each item has a quality attribute that makes an operation faster or slower. The attribute is just a string identifier, and there is no direct way to derive a speed factor from the string itself. What is known is that there is a set of distinct speed factors possible with any operation code (A, B, etc.), but not all combinations may be realized.

Thus, for the linear program to provide the correct operation times, the variables (and the occurrence counts) have to be properly separated based on which speed factor category they belong to. For example:

5 * $A_1$ + 2 * $A_2$ + 3 * $A_3$ + 1 * $B_1$ + 4 * $B_2$ = 480 min

Where $A_1$ represents an operation A with a particular distinct speed factor 1, etc.
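As a sketch of what "separating the variables" means in terms of data, the snippet below counts the items of one shift per (operation, speed factor) pair. It assumes the mapping from quality string to speed factor were already known, which is exactly the missing piece; the quality strings `q1` to `q4`, the dictionary `speed_class`, and the function `shift_row` are all hypothetical.

```python
from collections import Counter

def shift_row(items, speed_class):
    """Count the items of one shift per (operation, speed factor) variable.

    items: list of (operation code, quality string) pairs.
    speed_class: assumed mapping from quality string to speed factor index.
    """
    return Counter((op, speed_class[quality]) for op, quality in items)

# Hypothetical mapping; in the real problem this is what has to be learned.
speed_class = {"q1": 1, "q2": 2, "q3": 3, "q4": 2}

# One shift, as a flat list of (operation, quality) items.
items = (
    [("A", "q1")] * 5 + [("A", "q2")] * 2 + [("A", "q3")] * 3
    + [("B", "q1")] * 1 + [("B", "q4")] * 4
)

row = shift_row(items, speed_class)
# row holds the coefficients of the example equation:
# 5 * A_1 + 2 * A_2 + 3 * A_3 + 1 * B_1 + 4 * B_2
```

Note that two different quality strings (here `q2` and `q4`) may map to the same speed factor, which is why the raw strings cannot simply be used as variables.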

So, I could cluster the quality attribute strings into speed factors if I knew the operation times. To know the operation times, I need the LP to feature the properly separated variables and counts. The trivial case is when an entire shift consists of items of the same quality, in which case the operation time is the shift time divided by the number of items. I tried eliminating unknowns this way, but only a handful of shifts are that clean; the rest have 3+ distinct quality attribute values.
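The trivial case can be sketched as follows. This is a minimal illustration with made-up shifts; it assumes a "clean" shift means every item shares both the same operation code and the same quality string, and the function name `pure_shift_times` is hypothetical.

```python
def pure_shift_times(shifts):
    """Estimate per-item times from shifts containing a single item kind.

    shifts: list of (shift length in minutes, list of (operation, quality)).
    Returns {(operation, quality): [estimated minutes per item, ...]}.
    """
    estimates = {}
    for minutes, items in shifts:
        kinds = set(items)
        if len(kinds) == 1:  # every item has the same operation and quality
            estimates.setdefault(kinds.pop(), []).append(minutes / len(items))
    return estimates

# Made-up example: the first shift is clean, the second is mixed.
shifts = [
    (480, [("A", "q1")] * 16),
    (480, [("A", "q1")] * 10 + [("B", "q2")] * 5),
]

estimates = pure_shift_times(shifts)
print(estimates)  # {('A', 'q1'): [30.0]}
```

Mixed shifts are skipped entirely here, which is why this approach on its own only resolves the few clean cases.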

How can I resolve this cyclic dependency between the two calculations (iterative back-and-forth, but how)?
