pm4py.algo.discovery.correlation_mining.variants package
Submodules
pm4py.algo.discovery.correlation_mining.variants.classic module
This file is part of PM4Py (More Info: https://pm4py.fit.fraunhofer.de).
PM4Py is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
PM4Py is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with PM4Py. If not, see <https://www.gnu.org/licenses/>.
class pm4py.algo.discovery.correlation_mining.variants.classic.Parameters
Bases: enum.Enum
An enumeration.
- ACTIVITY_KEY = 'pm4py:param:activity_key'
- EXACT_TIME_MATCHING = 'exact_time_matching'
- INDEX_KEY = 'index_key'
- START_TIMESTAMP_KEY = 'pm4py:param:start_timestamp_key'
- TIMESTAMP_KEY = 'pm4py:param:timestamp_key'
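The apply signatures in this package type the parameters argument as a dictionary keyed by either a string or a Parameters member. The following is a minimal sketch of how such a dictionary could be read, not PM4Py's own lookup code: the Parameters class below is a stand-in that copies the member values listed above, and get_param, together with the key values "activity" and "ts", is hypothetical.

```python
from enum import Enum
from typing import Any, Dict, Optional, Union


class Parameters(Enum):
    # Stand-in enumeration (assumption): copies the member values listed above.
    ACTIVITY_KEY = "pm4py:param:activity_key"
    EXACT_TIME_MATCHING = "exact_time_matching"
    INDEX_KEY = "index_key"
    START_TIMESTAMP_KEY = "pm4py:param:start_timestamp_key"
    TIMESTAMP_KEY = "pm4py:param:timestamp_key"


def get_param(key: Parameters,
              parameters: Optional[Dict[Union[str, Parameters], Any]],
              default: Any) -> Any:
    """Hypothetical helper: look up by enum member first, then by its string value."""
    if parameters is None:
        return default
    if key in parameters:
        return parameters[key]
    if key.value in parameters:
        return parameters[key.value]
    return default


# One entry keyed by the enum member, one keyed by the plain string value.
params = {
    Parameters.ACTIVITY_KEY: "activity",
    "pm4py:param:timestamp_key": "ts",
}

activity_key = get_param(Parameters.ACTIVITY_KEY, params, "concept:name")
timestamp_key = get_param(Parameters.TIMESTAMP_KEY, params, "time:timestamp")
exact = get_param(Parameters.EXACT_TIME_MATCHING, params, False)  # absent, so the default is used
```

Accepting both key forms lets callers pass settings without importing the enumeration of each variant.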
pm4py.algo.discovery.correlation_mining.variants.classic.apply(log: Union[pm4py.objects.log.obj.EventLog, pm4py.objects.log.obj.EventStream, pandas.core.frame.DataFrame], parameters: Optional[Dict[Union[str, pm4py.algo.discovery.correlation_mining.variants.classic.Parameters], Any]] = None) → Tuple[Dict[Tuple[str, str], int], Dict[Tuple[str, str], float]]
Applies the correlation miner to an event stream (other log types are converted to an event stream first).
The approach is described in: Pourmirza, Shaya, Remco Dijkman, and Paul Grefen. “Correlation miner: mining business process models and event correlations without case identifiers.” International Journal of Cooperative Information Systems 26.02 (2017): 1742002.
Parameters:
- log – Log object
- parameters – Parameters of the algorithm
Returns:
- dfg – DFG (directly-follows graph)
- performance_dfg – Performance DFG (containing the estimated performance for the arcs)
pm4py.algo.discovery.correlation_mining.variants.classic.get_PS_dur_matrix(activities_grouped, activities, parameters=None)
Combined method that computes both matrices (precede-succeed and duration).
Parameters:
- activities_grouped – Grouped activities
- activities – List of activities of the log
- parameters – Parameters of the algorithm
Returns:
- PS_matrix – Precede-succeed matrix
- duration_matrix – Duration matrix
pm4py.algo.discovery.correlation_mining.variants.classic.get_duration_matrix(activities, activities_grouped, timestamp_key, start_timestamp_key, exact=False)
Calculates the duration matrix.
Parameters:
- activities – Ordered list of activities of the log
- activities_grouped – Grouped list of activities
- timestamp_key – Timestamp key
- start_timestamp_key – Start timestamp key (events start)
- exact – Performs an exact matching of the times (True/False)
Returns:
- duration_matrix – Duration matrix
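To make the idea of a duration matrix concrete, here is a minimal sketch that fills entry (i, j) with the mean time between matched events of activity i and activity j. The greedy first-in-first-out matching used here is an assumption chosen for illustration; it is not taken from PM4Py, and the function names and plain-float timestamps are hypothetical.

```python
from statistics import mean


def fifo_match(a_times, b_times):
    """Pair each time in a_times with the earliest unused, strictly later time in b_times.

    Both inputs must be sorted in ascending order.
    """
    pairs = []
    j = 0
    for t in a_times:
        # Skip candidates that do not occur strictly after t.
        while j < len(b_times) and b_times[j] <= t:
            j += 1
        if j == len(b_times):
            break
        pairs.append((t, b_times[j]))
        j += 1  # each later event is used at most once
    return pairs


def duration_matrix(activities, end_times, start_times):
    """Entry [i][j] is the mean delay between matched events of activities i and j."""
    n = len(activities)
    matrix = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(activities):
        for j, b in enumerate(activities):
            if i == j:
                continue
            pairs = fifo_match(sorted(end_times[a]), sorted(start_times[b]))
            if pairs:
                matrix[i][j] = mean(tb - ta for ta, tb in pairs)
    return matrix


activities = ["A", "B"]
# Instantaneous events: the same timestamps serve as start and completion times.
times = {"A": [0.0, 10.0, 20.0], "B": [4.0, 16.0, 22.0]}
dm = duration_matrix(activities, times, times)
```

With this toy data, every A event is matched with the next B event, while the last B event finds no later A event and is left unmatched.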
pm4py.algo.discovery.correlation_mining.variants.classic.get_precede_succeed_matrix(activities, activities_grouped, timestamp_key, start_timestamp_key)
Calculates the precede-succeed matrix.
Parameters:
- activities – Ordered list of activities of the log
- activities_grouped – Grouped list of activities
- timestamp_key – Timestamp key
- start_timestamp_key – Start timestamp key (events start)
Returns:
- precede_succeed_matrix – Precede-succeed matrix
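As an illustration of what a precede-succeed matrix captures, the sketch below sets entry (i, j) to the fraction of event pairs in which an event of activity i completes before an event of activity j starts. This is a simplified quadratic-time reading of the concept under stated assumptions, not PM4Py's implementation; the function name and the plain-float timestamps are hypothetical.

```python
def precede_succeed_matrix(activities, end_times, start_times):
    """Entry [i][j]: share of (event of i, event of j) pairs where i ends before j starts."""
    n = len(activities)
    matrix = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(activities):
        for j, b in enumerate(activities):
            if i == j or not end_times[a] or not start_times[b]:
                continue
            before = sum(1 for ta in end_times[a] for tb in start_times[b] if ta < tb)
            matrix[i][j] = before / (len(end_times[a]) * len(start_times[b]))
    return matrix


activities = ["A", "B"]
# Instantaneous events: the same timestamps serve as start and completion times.
times = {"A": [0.0, 10.0], "B": [4.0, 16.0]}
ps = precede_succeed_matrix(activities, times, times)
```

A value close to 1 in entry (i, j) suggests that activity i tends to happen before activity j; a value close to 0 suggests the opposite order.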
pm4py.algo.discovery.correlation_mining.variants.classic.preprocess_log(log, activities=None, parameters=None)
Preprocesses a log to enable correlation mining.
Parameters:
- log – Log object
- activities – (if provided) list of activities of the log
- parameters – Parameters of the algorithm
Returns:
- transf_stream – Transformed stream
- activities_grouped – Grouped activities
- activities – List of activities of the log
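The three return values suggest a preprocessing step that orders the events, groups them by activity, and lists the activities. The following sketch shows one way to produce values of that shape. It is an assumption about the structure, not PM4Py's code: events are plain dictionaries, timestamps are floats, and group_by_activity is a hypothetical name.

```python
from collections import defaultdict


def group_by_activity(stream, activity_key, timestamp_key):
    """Return (time-ordered events, events grouped per activity, sorted activity names)."""
    ordered = sorted(stream, key=lambda event: event[timestamp_key])
    grouped = defaultdict(list)
    for event in ordered:
        grouped[event[activity_key]].append(event)
    activities = sorted(grouped)
    return ordered, dict(grouped), activities


# A stream without case identifiers: only an activity name and a timestamp per event.
stream = [
    {"activity": "B", "ts": 4.0},
    {"activity": "A", "ts": 0.0},
    {"activity": "A", "ts": 10.0},
]
ordered, grouped, activities = group_by_activity(stream, "activity", "ts")
```

Because the events are sorted before grouping, each per-activity list is already in chronological order.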
pm4py.algo.discovery.correlation_mining.variants.classic.resolve_lp_get_dfg(PS_matrix, duration_matrix, activities, activities_counter)
Solves an LP (linear programming) problem to obtain a DFG.
Parameters:
- PS_matrix – Precede-succeed matrix
- duration_matrix – Duration matrix
- activities – List of activities of the log
- activities_counter – Counter of the activities
Returns:
- dfg – DFG
- performance_dfg – Performance DFG (containing the estimated performance for the arcs)
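For orientation, one simplified way to pose such a problem is as a transportation-style linear program. This is an assumed formulation for illustration only: this page does not state the cost function or the constraints that PM4Py actually solves. Here x_{ij} is read as the frequency of the arc from activity i to activity j, n_i as the number of events of activity i (as given by activities_counter), and c_{ij} as an unspecified cost derived from the precede-succeed and duration matrices.

```latex
\begin{aligned}
\min_{x} \quad & \sum_{i}\sum_{j} c_{ij}\, x_{ij} \\
\text{s.t.} \quad & \sum_{j} x_{ij} = n_i \quad \text{for every activity } i, \\
& \sum_{i} x_{ij} = n_j \quad \text{for every activity } j, \\
& x_{ij} \ge 0 .
\end{aligned}
```

Under this reading, the arcs with a positive value of x_{ij} would make up the DFG.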
pm4py.algo.discovery.correlation_mining.variants.classic_split module
class pm4py.algo.discovery.correlation_mining.variants.classic_split.Parameters
Bases: enum.Enum
An enumeration.
- ACTIVITY_KEY = 'pm4py:param:activity_key'
- SAMPLE_SIZE = 'sample_size'
- START_TIMESTAMP_KEY = 'pm4py:param:start_timestamp_key'
- TIMESTAMP_KEY = 'pm4py:param:timestamp_key'
pm4py.algo.discovery.correlation_mining.variants.classic_split.apply(log: Union[pm4py.objects.log.obj.EventLog, pm4py.objects.log.obj.EventStream, pandas.core.frame.DataFrame], parameters: Optional[Dict[Union[str, pm4py.algo.discovery.correlation_mining.variants.classic_split.Parameters], Any]] = None) → Tuple[Dict[Tuple[str, str], int], Dict[Tuple[str, str], float]]
Applies the correlation miner (splits the log into smaller chunks).
Parameters:
- log – Log object
- parameters – Parameters of the algorithm
Returns:
- dfg – Frequency DFG
- performance_dfg – Performance DFG
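The splitting step can be pictured as cutting the event sequence into consecutive chunks of at most sample_size events. The sketch below shows only that step; how the per-chunk results are combined is not described on this page and is deliberately left out. The function name split_into_chunks is hypothetical.

```python
def split_into_chunks(events, sample_size):
    """Cut a sequence into consecutive chunks holding at most sample_size items each."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return [events[start:start + sample_size]
            for start in range(0, len(events), sample_size)]


events = list(range(7))  # stand-in for seven time-ordered events
chunks = split_into_chunks(events, 3)
```

The last chunk may be shorter than sample_size when the number of events is not a multiple of it.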
pm4py.algo.discovery.correlation_mining.variants.trace_based module
class pm4py.algo.discovery.correlation_mining.variants.trace_based.Parameters
Bases: enum.Enum
An enumeration.
- ACTIVITY_KEY = 'pm4py:param:activity_key'
- CASE_ID_KEY = 'pm4py:param:case_id_key'
- INDEX_KEY = 'index_key'
- START_TIMESTAMP_KEY = 'pm4py:param:start_timestamp_key'
- TIMESTAMP_KEY = 'pm4py:param:timestamp_key'
pm4py.algo.discovery.correlation_mining.variants.trace_based.apply(log: Union[pm4py.objects.log.obj.EventLog, pm4py.objects.log.obj.EventStream, pandas.core.frame.DataFrame], parameters: Optional[Dict[Union[str, pm4py.algo.discovery.correlation_mining.variants.trace_based.Parameters], Any]] = None) → Tuple[Dict[Tuple[str, str], int], Dict[Tuple[str, str], float]]
A novel correlation mining approach that builds the PS matrix and the duration matrix from the ordered list of events of each trace of the log.
Parameters:
- log – Event log
- parameters – Parameters
Returns:
- dfg – DFG
- performance_dfg – Performance DFG (containing the estimated performance for the arcs)
pm4py.algo.discovery.correlation_mining.variants.trace_based.get_PS_duration_matrix(activities, trace_grouped_list, parameters=None)
Gets the precede-succeed matrix and the duration matrix.
Parameters:
- activities – Activities
- trace_grouped_list – Grouped list of simplified traces (per activity)
- parameters – Parameters of the algorithm
Returns:
- PS_matrix – Precede-succeed matrix
- duration_matrix – Duration matrix
pm4py.algo.discovery.correlation_mining.variants.trace_based.get_duration_matrix(activities, trace_grouped_list, timestamp_key, start_timestamp_key)
Calculates the duration matrix.
Parameters:
- activities – Sorted list of activities of the log
- trace_grouped_list – A list of lists of lists, containing for each trace and each activity the events having such activity
- timestamp_key – The key to be used as timestamp
- start_timestamp_key – The key to be used as start timestamp
Returns:
- mat – The duration matrix
pm4py.algo.discovery.correlation_mining.variants.trace_based.get_precede_succeed_matrix(activities, trace_grouped_list, timestamp_key, start_timestamp_key)
Calculates the precede-succeed matrix.
Parameters:
- activities – Sorted list of activities of the log
- trace_grouped_list – A list of lists of lists, containing for each trace and each activity the events having such activity
- timestamp_key – The key to be used as timestamp
- start_timestamp_key – The key to be used as start timestamp
Returns:
- mat – The precede-succeed matrix
pm4py.algo.discovery.correlation_mining.variants.trace_based.preprocess_log(log, activities=None, activities_counter=None, parameters=None)
Preprocesses the log to get a grouped list of simplified traces (per activity).
Parameters:
- log – Log object
- activities – (if provided) activities of the log
- activities_counter – (if provided) counter of the activities of the log
- parameters – Parameters of the algorithm
Returns:
- traces_list – List of simplified traces of the log
- trace_grouped_list – Grouped list of simplified traces (per activity)
- activities – Activities of the log
- activities_counter – Activities counter
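The trace_grouped_list structure is described in this module as a list of lists of lists, holding for each trace and each activity the events of that activity. The sketch below builds a structure of that shape from simplified traces. The exact layout PM4Py uses may differ: indexing the middle level by the position of the activity in the activities list is an assumption, and group_traces_by_activity is a hypothetical name.

```python
def group_traces_by_activity(traces, activities, activity_key):
    """Return grouped[t][a]: the events of trace t whose activity is activities[a]."""
    position = {name: index for index, name in enumerate(activities)}
    grouped = []
    for trace in traces:
        per_activity = [[] for _ in activities]
        for event in trace:
            per_activity[position[event[activity_key]]].append(event)
        grouped.append(per_activity)
    return grouped


activities = ["A", "B"]
traces = [
    [{"act": "A", "t": 1}, {"act": "B", "t": 2}, {"act": "A", "t": 3}],
    [{"act": "B", "t": 5}],
]
trace_grouped_list = group_traces_by_activity(traces, activities, "act")
```

An activity that does not occur in a trace simply gets an empty list at its position.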
pm4py.algo.discovery.correlation_mining.variants.trace_based.resolve_lp_get_dfg(PS_matrix, duration_matrix, activities, activities_counter)
Solves an LP (linear programming) problem to obtain a DFG.
Parameters:
- PS_matrix – Precede-succeed matrix
- duration_matrix – Duration matrix
- activities – List of activities of the log
- activities_counter – Counter for the activities of the log
Returns:
- dfg – Frequency DFG
- performance_dfg – Performance DFG
Module contents