pm4py.algo.simulation.playout.dfg.variants package

Submodules

pm4py.algo.simulation.playout.dfg.variants.classic module

This file is part of PM4Py (More Info: https://pm4py.fit.fraunhofer.de).

PM4Py is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

PM4Py is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with PM4Py. If not, see <https://www.gnu.org/licenses/>.

class pm4py.algo.simulation.playout.dfg.variants.classic.Parameters

Bases: enum.Enum

An enumeration.

ACTIVITY_KEY = 'pm4py:param:activity_key'

ADD_TRACE_IF_TAKES_NEW_ELS_TO_DFG = 'add_trace_if_takes_new_els_to_dfg'

INTERRUPT_SIMULATION_WHEN_DFG_COMPLETE = 'interrupt_simulation_when_dfg_complete'

MAX_EXECUTION_TIME = 'max_execution_time'

MAX_NO_OCC_PER_ACTIVITY = 'max_no_occ_per_activitiy'

MAX_NO_VARIANTS = 'max_no_variants'

MIN_VARIANT_OCC = 'min_variant_occ'

MIN_WEIGHTED_PROBABILITY = 'min_weighted_probability'

RETURN_ONLY_IF_COMPLETE = 'return_only_if_complete'

RETURN_VARIANTS = 'return_variants'

TIMESTAMP_KEY = 'pm4py:param:timestamp_key'

pm4py.algo.simulation.playout.dfg.variants.classic.apply(dfg: Dict[Tuple[str, str], int], start_activities: Dict[str, int], end_activities: Dict[str, int], parameters: Optional[Dict[Union[str, pm4py.algo.simulation.playout.dfg.variants.classic.Parameters], Any]] = None) → Union[pm4py.objects.log.obj.EventLog, Dict[Tuple[str, str], int]]

Applies the playout algorithm on a DFG, extracting the most likely traces according to the DFG

Parameters:

dfg – Complete DFG
start_activities – Start activities
end_activities – End activities
parameters – Parameters of the algorithm, including:
- Parameters.ACTIVITY_KEY => the activity key of the simulated log
- Parameters.TIMESTAMP_KEY => the timestamp key of the simulated log
- Parameters.MAX_NO_VARIANTS => the maximum number of variants generated by the method (default: 3000)
- Parameters.MIN_WEIGHTED_PROBABILITY => the minimum overall weighted probability that makes the method stop (default: 1)
- Parameters.MAX_NO_OCC_PER_ACTIVITY => the maximum number of occurrences per activity in the traces of the log (default: 2)
- Parameters.INTERRUPT_SIMULATION_WHEN_DFG_COMPLETE => interrupts the simulation when the DFG of the simulated log has the same keys as the DFG of the original log, i.e., all behavior is contained (default: False)
- Parameters.ADD_TRACE_IF_TAKES_NEW_ELS_TO_DFG => adds a simulated trace to the simulated log only if it adds elements to the simulated DFG, i.e., it adds behavior; the trace is skipped otherwise (default: False)
- Parameters.RETURN_VARIANTS => returns the traces as variants with a likely number of occurrences
Returns:	Simulated log
Return type:	simulated_log
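The interplay of MAX_NO_VARIANTS and MIN_WEIGHTED_PROBABILITY can be sketched in plain Python. This is only an illustration of the stopping rule as described above, not the library's implementation: the helper name `collect_variants` and the input format (an iterable of `(trace, probability)` pairs, already sorted from most to least probable) are assumptions made for the example.

```python
from typing import Iterable, List, Tuple

Variant = Tuple[Tuple[str, ...], float]  # (trace as a tuple of activities, probability)


def collect_variants(
    ranked_traces: Iterable[Variant],
    max_no_variants: int = 3000,
    min_weighted_probability: float = 1.0,
) -> Tuple[List[Variant], float]:
    """Hypothetical helper: keep traces until one of the two limits is reached."""
    selected: List[Variant] = []
    cumulative = 0.0
    for trace, prob in ranked_traces:
        # Stop when enough variants were collected or enough probability mass is covered.
        if len(selected) >= max_no_variants or cumulative >= min_weighted_probability:
            break
        selected.append((trace, prob))
        cumulative += prob
    return selected, cumulative


ranked = [(("a", "b"), 0.5), (("a", "c"), 0.3), (("a", "d"), 0.2)]
by_count, _ = collect_variants(ranked, max_no_variants=2)
by_mass, covered = collect_variants(ranked, min_weighted_probability=0.7)
```

With the default threshold of 1, the probability limit is reached only once all of the behavior has been covered, so in practice the variant limit is the one that tends to stop the enumeration.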

pm4py.algo.simulation.playout.dfg.variants.classic.get_node_tr_probabilities(dfg, start_activities, end_activities)

Gets the transition probabilities between the nodes of a DFG

Parameters:

dfg – DFG
start_activities – Start activities
end_activities – End activities

Returns:

weighted_start_activities – Start activities, with a relative weight going from 0 to 1
node_transition_probabilities – The transition probabilities between the nodes of the DFG (the end node is None)
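A minimal sketch of how such probabilities could be derived from the counts, assuming that each node's outgoing arc counts and its end-activity count are normalised together. The function name `node_transition_probabilities` is hypothetical and the code is not taken from the library.

```python
from typing import Dict, Optional, Tuple


def node_transition_probabilities(
    dfg: Dict[Tuple[str, str], int],
    start_activities: Dict[str, int],
    end_activities: Dict[str, int],
):
    """Hypothetical re-implementation of the idea, for illustration only."""
    # Normalise the start-activity counts so that the weights sum to 1.
    total_start = sum(start_activities.values())
    weighted_start = {act: cnt / total_start for act, cnt in start_activities.items()}

    # Collect the outgoing counts of every node; the key None means "the trace ends here".
    outgoing: Dict[str, Dict[Optional[str], int]] = {}
    for (src, tgt), cnt in dfg.items():
        outgoing.setdefault(src, {})[tgt] = cnt
    for act, cnt in end_activities.items():
        outgoing.setdefault(act, {})[None] = cnt

    # Turn each node's outgoing counts into probabilities.
    probabilities: Dict[str, Dict[Optional[str], float]] = {}
    for src, targets in outgoing.items():
        total = sum(targets.values())
        probabilities[src] = {tgt: cnt / total for tgt, cnt in targets.items()}
    return weighted_start, probabilities


example_dfg = {("a", "b"): 3, ("a", "c"): 1, ("b", "c"): 3}
weights, probs = node_transition_probabilities(example_dfg, {"a": 4}, {"c": 4})
```

In this example, activity `a` is followed by `b` three times out of four, and `c` always ends the trace.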

pm4py.algo.simulation.playout.dfg.variants.classic.get_trace_probability(trace, dfg, start_activities, end_activities, parameters=None)

Given a trace of a log, gets its probability given the complete DFG

Parameters:

trace – Trace of a log
dfg – Complete DFG
start_activities – Start activities of the model
end_activities – End activities of the model
parameters – Parameters of the algorithm:
- Parameters.ACTIVITY_KEY => activity key
Returns:	The probability of the trace according to the DFG
Return type:	prob
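One way to read this computation: the probability of a trace is the weight of its first activity, multiplied by the transition probability of each directly-follows step, multiplied by the probability of ending at the last activity. The sketch below follows that reading under two assumptions: the trace is given as a plain list of activity names (the library works on trace objects and the activity key), and the helper name `trace_probability` is hypothetical.

```python
from typing import Dict, List, Tuple


def trace_probability(
    trace: List[str],
    dfg: Dict[Tuple[str, str], int],
    start_activities: Dict[str, int],
    end_activities: Dict[str, int],
) -> float:
    """Hypothetical helper: probability of a list of activities under the DFG."""
    total_start = sum(start_activities.values())
    if not trace or total_start == 0:
        return 0.0

    # Total outgoing count per node, including the count of ending at that node.
    out_total: Dict[str, int] = {}
    for (src, _tgt), cnt in dfg.items():
        out_total[src] = out_total.get(src, 0) + cnt
    for act, cnt in end_activities.items():
        out_total[act] = out_total.get(act, 0) + cnt

    # Probability of starting with the first activity.
    prob = start_activities.get(trace[0], 0) / total_start

    # Probability of each directly-follows step.
    for src, tgt in zip(trace, trace[1:]):
        if out_total.get(src, 0) == 0:
            return 0.0
        prob *= dfg.get((src, tgt), 0) / out_total[src]

    # Probability of ending after the last activity.
    last = trace[-1]
    if out_total.get(last, 0) == 0:
        return 0.0
    return prob * end_activities.get(last, 0) / out_total[last]


example_dfg = {("a", "b"): 3, ("a", "c"): 1, ("b", "c"): 3}
p_abc = trace_probability(["a", "b", "c"], example_dfg, {"a": 4}, {"c": 4})
p_ac = trace_probability(["a", "c"], example_dfg, {"a": 4}, {"c": 4})
```

A trace that uses an arc missing from the DFG, or that stops at a non-end activity, gets probability 0 in this sketch.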

pm4py.algo.simulation.playout.dfg.variants.classic.get_traces(dfg, start_activities, end_activities, parameters=None)

Yields the traces of the DFG one by one (as an iterator), from the most probable to the least probable

Parameters:

dfg – Complete DFG
start_activities – Start activities
end_activities – End activities
parameters – Parameters of the algorithm, including:
- Parameters.MAX_NO_OCC_PER_ACTIVITY => the maximum number of occurrences per activity in the traces of the log (default: 2)
Returns:	Trace of the simulation
Return type:	yielded_trace
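Yielding traces in order of decreasing probability can be done with a best-first search over trace prefixes: extending a prefix can only lower its probability, so the most probable entry of a priority queue is always safe to expand or emit. The generator below is a sketch of that technique under stated assumptions; the name `most_probable_traces`, the `max_occ` argument, and the use of `heapq` are choices made for this example, not a description of the library's code.

```python
import heapq
from typing import Dict, Iterator, Optional, Tuple


def most_probable_traces(
    dfg: Dict[Tuple[str, str], int],
    start_activities: Dict[str, int],
    end_activities: Dict[str, int],
    max_occ: int = 2,
) -> Iterator[Tuple[Tuple[str, ...], float]]:
    """Hypothetical generator: yields (trace, probability), most probable first."""
    # Outgoing counts per node; the key None means "the trace ends here".
    outgoing: Dict[str, Dict[Optional[str], int]] = {}
    for (src, tgt), cnt in dfg.items():
        outgoing.setdefault(src, {})[tgt] = cnt
    for act, cnt in end_activities.items():
        outgoing.setdefault(act, {})[None] = cnt

    total_start = sum(start_activities.values())
    # Heap entries: (negated probability, prefix, is_complete). heapq is a min-heap,
    # so negating the probability makes the most probable entry pop first.
    heap = [(-cnt / total_start, (act,), False) for act, cnt in start_activities.items()]
    heapq.heapify(heap)

    while heap:
        neg_prob, prefix, complete = heapq.heappop(heap)
        if complete:
            yield prefix, -neg_prob
            continue
        targets = outgoing.get(prefix[-1], {})
        total = sum(targets.values())
        for tgt, cnt in targets.items():
            prob = -neg_prob * cnt / total
            if tgt is None:
                # Ending here completes the trace with its final probability.
                heapq.heappush(heap, (-prob, prefix, True))
            elif prefix.count(tgt) < max_occ:
                # Extend the prefix only while the activity limit is respected.
                heapq.heappush(heap, (-prob, prefix + (tgt,), False))


example_dfg = {("a", "b"): 3, ("a", "c"): 1, ("b", "c"): 3}
ranked = list(most_probable_traces(example_dfg, {"a": 4}, {"c": 4}))
```

The per-activity limit is what keeps the enumeration finite when the DFG contains loops.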

pm4py.algo.simulation.playout.dfg.variants.performance module

class pm4py.algo.simulation.playout.dfg.variants.performance.Parameters

Bases: enum.Enum

An enumeration.

ACTIVITY_KEY = 'pm4py:param:activity_key'

CASE_ARRIVAL_RATE = 'case_arrival_rate'

CASE_ID_KEY = 'pm4py:param:case_id_key'

NUM_TRACES = 'num_traces'

PARAM_ARTIFICIAL_END_ACTIVITY = 'pm4py:param:art_end_act'

PARAM_ARTIFICIAL_START_ACTIVITY = 'pm4py:param:art_start_act'

PERFORMANCE_DFG = 'performance_dfg'

TIMESTAMP_KEY = 'pm4py:param:timestamp_key'

pm4py.algo.simulation.playout.dfg.variants.performance.apply(frequency_dfg: Dict[Tuple[str, str], int], start_activities: Dict[str, int], end_activities: Dict[str, int], parameters: Optional[Dict[Any, Any]] = None) → pm4py.objects.log.obj.EventLog

Simulates a log using the transition probabilities provided by the frequency DFG and the time deltas provided by the performance DFG

Parameters:

frequency_dfg – Frequency DFG
start_activities – Start activities
end_activities – End activities
parameters – Parameters of the algorithm, including:
- Parameters.NUM_TRACES: the number of traces of the simulated log
- Parameters.ACTIVITY_KEY: the activity key to be used in the simulated log
- Parameters.TIMESTAMP_KEY: the timestamp key to be used in the simulated log
- Parameters.CASE_ID_KEY: the case identifier key to be used in the simulated log
- Parameters.CASE_ARRIVAL_RATE: the average distance (in seconds) between the start of two cases (default: 1)
- Parameters.PERFORMANCE_DFG: (mandatory) the performance DFG that is used for the time deltas
Returns:	Simulated log
Return type:	simulated_log
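The idea behind this variant can be sketched with the standard library alone. The sketch assumes that the next activity is drawn with a probability proportional to the frequency-DFG counts, that each performance-DFG value is a single mean duration in seconds, and that both the time deltas and the gaps between case starts are exponentially distributed. The function name `simulate_log`, the `(activity, timestamp)` event format, and the `seed` argument are hypothetical; the library itself returns an EventLog.

```python
import random
from typing import Dict, List, Optional, Tuple

Event = Tuple[str, float]  # (activity, timestamp in seconds since the simulation start)


def simulate_log(
    frequency_dfg: Dict[Tuple[str, str], int],
    performance_dfg: Dict[Tuple[str, str], float],
    start_activities: Dict[str, int],
    end_activities: Dict[str, int],
    num_traces: int,
    case_arrival_rate: float = 1.0,
    seed: Optional[int] = None,
) -> List[List[Event]]:
    """Hypothetical sketch of a timed playout; not the library's implementation."""
    rng = random.Random(seed)

    # Outgoing counts per node; the key None means "the case ends here".
    outgoing: Dict[str, Dict[Optional[str], int]] = {}
    for (src, tgt), cnt in frequency_dfg.items():
        outgoing.setdefault(src, {})[tgt] = cnt
    for act, cnt in end_activities.items():
        outgoing.setdefault(act, {})[None] = cnt

    log: List[List[Event]] = []
    clock = 0.0
    for _ in range(num_traces):
        # Gap between consecutive case starts: exponential with mean case_arrival_rate.
        clock += rng.expovariate(1.0 / case_arrival_rate)
        current = rng.choices(
            list(start_activities), weights=list(start_activities.values())
        )[0]
        timestamp = clock
        trace: List[Event] = [(current, timestamp)]
        while True:
            targets = outgoing.get(current)
            if not targets:
                break  # dead end: no outgoing arc and not an end activity
            nxt = rng.choices(list(targets), weights=list(targets.values()))[0]
            if nxt is None:
                break  # the case ends at the current activity
            mean_delta = performance_dfg[(current, nxt)]
            if mean_delta > 0:
                # Time delta: exponential with the performance DFG value as its mean.
                timestamp += rng.expovariate(1.0 / mean_delta)
            trace.append((nxt, timestamp))
            current = nxt
        log.append(trace)
    return log


demo_log = simulate_log(
    {("a", "b"): 1}, {("a", "b"): 10.0}, {"a": 1}, {"b": 1}, num_traces=3, seed=42
)
```

Passing a fixed `seed` makes the sketch reproducible, which is convenient when comparing simulated logs.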

pm4py.algo.simulation.playout.dfg.variants.performance.choice(a, size=None, replace=True, p=None)

Generates a random sample from a given 1-D array

New in version 1.7.0.

Note

New code should use the choice method of a default_rng() instance instead; please see the NumPy random Quick Start guide.

Parameters:

a (1-D array-like or int) – If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if it were `np.arange(a)`.
size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., `(m, n, k)`, then `m * n * k` samples are drawn. Default is None, in which case a single value is returned.
replace (boolean, optional) – Whether the sample is with or without replacement. Default is True, meaning that a value of `a` can be selected multiple times.
p (1-D array-like, optional) – The probabilities associated with each entry in `a`. If not given, the sample assumes a uniform distribution over all entries in `a`.
Returns:	samples – The generated random samples
Return type:	single item or ndarray
Raises:	`ValueError` – If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size

See also

randint(), shuffle(), permutation()

Generator.choice(): which should be used in new code

Notes

Setting user-specified probabilities through p uses a more general but less efficient sampler than the default. The general sampler produces a different sample than the optimized sampler even if each element of p is 1 / len(a).

Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.

Examples

Generate a uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3)
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)

Generate a non-uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0]) # random

Generate a uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False)
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]

Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0]) # random

Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:

>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
      dtype='<U11')

pm4py.algo.simulation.playout.dfg.variants.performance.dict_based_choice(dct: Dict[str, float]) → str

Performs a weighted choice, given a dictionary associating a weight to each possible choice

Parameters:	dct – Dictionary associating a weight to each choice
Returns:	Choice
Return type:	choice
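A weighted choice over a dictionary can be written in a few lines. The sketch below uses `random.choices` from the standard library rather than NumPy; the helper name `weighted_choice` and its optional `rng` argument are assumptions made for this example.

```python
import random
from typing import Dict, Optional


def weighted_choice(dct: Dict[str, float], rng: Optional[random.Random] = None) -> str:
    """Hypothetical helper: pick a key with probability proportional to its weight."""
    generator = rng if rng is not None else random.Random()
    keys = list(dct)
    weights = [dct[key] for key in keys]
    # random.choices accepts relative weights, so they need not sum to 1.
    return generator.choices(keys, weights=weights, k=1)[0]


picked = weighted_choice({"approve": 0.7, "reject": 0.3}, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps the choice reproducible without touching the global random state.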

pm4py.algo.simulation.playout.dfg.variants.performance.exponential(scale=1.0, size=None)

Draw samples from an exponential distribution.

Its probability density function is

\[f(x; \frac{1}{\beta}) = \frac{1}{\beta} \exp(-\frac{x}{\beta}),\]

for x > 0 and 0 elsewhere. \(\beta\) is the scale parameter, which is the inverse of the rate parameter \(\lambda = 1/\beta\). The rate parameter is an alternative, widely used parameterization of the exponential distribution [3].

The exponential distribution is a continuous analogue of the geometric distribution. It describes many common situations, such as the size of raindrops measured over many rainstorms [1], or the time between page requests to Wikipedia [2].

Note

New code should use the exponential method of a default_rng() instance instead; please see the NumPy random Quick Start guide.

Parameters:

scale (float or array_like of floats) – The scale parameter, \(\beta = 1/\lambda\). Must be non-negative.
size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., `(m, n, k)`, then `m * n * k` samples are drawn. If size is `None` (default), a single value is returned if `scale` is a scalar. Otherwise, `np.array(scale).size` samples are drawn.
Returns:	out – Drawn samples from the parameterized exponential distribution.
Return type:	ndarray or scalar
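The relationship between the scale \(\beta\) and the rate \(\lambda = 1/\beta\) is easy to get backwards. As a small illustration, the standard library's `random.expovariate` takes the rate, so a sample with scale \(\beta\) is drawn by passing \(1/\beta\); the sample mean should then be close to \(\beta\). The helper name `sample_exponential` is hypothetical, and the sketch deliberately avoids NumPy so that it runs with the standard library only.

```python
import random
from typing import List


def sample_exponential(scale: float, size: int, seed: int = 0) -> List[float]:
    """Hypothetical helper: draw `size` samples with mean `scale` (beta)."""
    rng = random.Random(seed)
    rate = 1.0 / scale  # lambda = 1 / beta
    return [rng.expovariate(rate) for _ in range(size)]


samples = sample_exponential(scale=2.0, size=100_000)
sample_mean = sum(samples) / len(samples)  # expected to be close to the scale, 2.0
```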

Module contents
