Mining and reasoning on workflows

Abstract

Today’s workflow management systems represent a key technological infrastructure for advanced applications that is attracting a growing body of research, mainly focused in developing tools for workflow management, that allow users both to specify the “static” aspects, like preconditions, precedences among activities, and rules for exception handling, and to control its execution by scheduling the activities on the available resources. This paper deals with an aspect of workflows which has so far not received much attention even though it is crucial for the forthcoming scenarios of large scale applications on the Web: providing facilities for the human system administrator for identifying the choices performed more frequently in the past that had lead to a desired final configuration. In this context, we formalize the problem of discovering the most frequent patterns of executions, i.e., the workflow substructures that have been scheduled more frequently by the system. We attacked the problem by developing two data mining algorithms on the basis of an intuitive and original graph formalization of a workflow schema and its occurrences. The model is used both to prove some intractability results that strongly motivate the use of data mining techniques and to derive interesting structural properties for reducing the search space for frequent patterns. Indeed, the experiments we have carried out show that our algorithms outperform standard data mining algorithms adapted to discover frequent patterns of workflow executions.

Publication
IEEE Transactions on Knowledge and Data Engineering