pm4py.algo.filtering.dfg package

Submodules

pm4py.algo.filtering.dfg.dfg_filtering module

This file is part of PM4Py (More Info: https://pm4py.fit.fraunhofer.de).

PM4Py is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

PM4Py is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with PM4Py. If not, see <https://www.gnu.org/licenses/>.

pm4py.algo.filtering.dfg.dfg_filtering.clean_dfg_based_on_noise_thresh(dfg, activities, noise_threshold, parameters=None)

Clean Directly-Follows graph based on noise threshold

Parameters:
  • dfg – Directly-Follows graph
  • activities – Activities in the DFG
  • noise_threshold – Noise threshold
  • parameters – Parameters (optional)
Returns:

DFG cleaned according to the noise threshold

Return type:

newDfg
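The page does not spell out how the noise threshold is applied, so the following is only a minimal sketch of one plausible reading: an edge is dropped when its frequency is below noise_threshold times the largest edge frequency seen at either of its endpoints. The helper name clean_dfg_sketch and the dict-of-edge-frequencies input are assumptions made for illustration, not PM4Py's implementation.

```python
def clean_dfg_sketch(dfg, activities, noise_threshold):
    """Hypothetical helper: drop edges that are rare relative to their endpoints."""
    # Largest edge frequency touching each activity, as source or as target.
    max_count = {act: 0 for act in activities}
    for (src, tgt), freq in dfg.items():
        max_count[src] = max(max_count.get(src, 0), freq)
        max_count[tgt] = max(max_count.get(tgt, 0), freq)

    cleaned = {}
    for (src, tgt), freq in dfg.items():
        # Assumed rule: compare against the weaker of the two endpoints.
        cutoff = noise_threshold * min(max_count[src], max_count[tgt])
        if freq >= cutoff:
            cleaned[(src, tgt)] = freq
    return cleaned


# Toy DFG: edge -> number of times the second activity directly follows the first.
dfg = {("a", "b"): 100, ("b", "c"): 90, ("a", "c"): 2}
cleaned = clean_dfg_sketch(dfg, ["a", "b", "c"], 0.2)
print(cleaned)  # the rare ("a", "c") edge is removed
```

With a threshold of 0 nothing is removed in this sketch; raising the threshold removes progressively more of the infrequent edges.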

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_contain_activity(dfg0, start_activities0, end_activities0, activities_count0, activity, parameters=None)

Filters the DFG, keeping only the nodes that can reach, or are reachable from, the given activity

Parameters:
  • dfg0 – Directly-follows graph
  • start_activities0 – Start activities
  • end_activities0 – End activities
  • activities_count0 – Activities count
  • activity – Activity that every node of the filtered graph must either reach or be reachable from
  • parameters – Parameters
Returns:

  • dfg – Filtered DFG
  • start_activities – Filtered start activities
  • end_activities – Filtered end activities
  • activities_count – Filtered activities count
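The reachability idea behind this filter can be sketched in plain Python. The sketch below covers only the edges of the DFG (the start activities, end activities and counts are left out), and the helper names reachable and keep_around_activity are hypothetical, introduced here for illustration rather than taken from PM4Py.

```python
from collections import deque


def reachable(adjacency, source):
    """Breadth-first search: all nodes reachable from source, source included."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def keep_around_activity(dfg, activity):
    """Hypothetical helper: keep edges among nodes that reach, or are reached from, activity."""
    forward, backward = {}, {}
    for src, tgt in dfg:
        forward.setdefault(src, set()).add(tgt)
        backward.setdefault(tgt, set()).add(src)
    # Descendants of the activity plus its ancestors.
    keep = reachable(forward, activity) | reachable(backward, activity)
    return {edge: freq for edge, freq in dfg.items()
            if edge[0] in keep and edge[1] in keep}


dfg = {("a", "b"): 5, ("b", "c"): 4, ("x", "y"): 3}
filtered = keep_around_activity(dfg, "b")
print(filtered)  # the unrelated ("x", "y") edge is dropped
```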

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_from_activity(dfg0, start_activities0, end_activities0, activities_count0, source_activity, parameters=None)

Filters the DFG, making “source_activity” the only possible source activity of the graph

Parameters:
  • dfg0 – Directly-follows graph
  • start_activities0 – Start activities
  • end_activities0 – End activities
  • activities_count0 – Activities count
  • source_activity – Source activity (only possible start activity after the filtering)
  • parameters – Parameters
Returns:

  • dfg – Filtered DFG
  • start_activities – Filtered start activities
  • end_activities – Filtered end activities
  • activities_count – Filtered activities count

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_keep_connected(dfg0, start_activities0, end_activities0, activities_count0, threshold, keep_all_activities=False)

Filters a DFG (complete, and so connected) on the specified dependency threshold (the Heuristics Miner dependency measure), while ensuring that every node is still reachable from the start and can still reach the end

Parameters:
  • dfg0 – (Complete, and so connected) DFG
  • start_activities0 – Start activities
  • end_activities0 – End activities
  • activities_count0 – Activities of the DFG along with their count
  • threshold – Dependency threshold as in the Heuristics Miner
  • keep_all_activities – Whether to keep all the activities, or only the ones appearing in the edges that meet the threshold (default).
Returns:

  • dfg – (Filtered) DFG
  • start_activities – (Filtered) start activities
  • end_activities – (Filtered) end activities
  • activities_count – (Filtered) activities of the DFG along with their count
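As a rough illustration of the threshold, here is a sketch of the dependency measure as it is commonly defined for the Heuristics Miner: (|a>b| - |b>a|) / (|a>b| + |b>a| + 1) for two distinct activities, and |a>a| / (|a>a| + 1) for a self-loop. Whether this function uses exactly that variant is not stated on this page, so treat the helper dependency below as an assumption. The sketch only identifies the edges that fall below the threshold; it does not perform the connectivity-preserving removal.

```python
def dependency(dfg, a, b):
    """Commonly cited Heuristics Miner dependency of b on a (assumed variant)."""
    ab = dfg.get((a, b), 0)
    if a == b:
        # Self-loop case.
        return ab / (ab + 1)
    ba = dfg.get((b, a), 0)
    return (ab - ba) / (ab + ba + 1)


dfg = {("a", "b"): 9, ("b", "c"): 5, ("c", "b"): 4}
threshold = 0.5

# Edges whose dependency falls below the threshold are candidates for removal.
weak_edges = sorted(edge for edge in dfg if dependency(dfg, *edge) < threshold)
print(weak_edges)  # [('b', 'c'), ('c', 'b')]
```

The edge from a to b scores 9 / 10 = 0.9 and survives, while b and c follow each other almost equally often, so both directions score close to zero.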

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_on_activities_percentage(dfg0, start_activities0, end_activities0, activities_count0, percentage)

Filters a DFG (complete, and so connected) on the specified percentage of activities, while ensuring that every node is still reachable from the start and can still reach the end

Parameters:
  • dfg0 – (Complete, and so connected) DFG
  • start_activities0 – Start activities
  • end_activities0 – End activities
  • activities_count0 – Activities of the DFG along with their count
  • percentage – Percentage of activities
Returns:

  • dfg – (Filtered) DFG
  • start_activities – (Filtered) start activities
  • end_activities – (Filtered) end activities
  • activities_count – (Filtered) activities of the DFG along with their count

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_on_paths_percentage(dfg0, start_activities0, end_activities0, activities_count0, percentage, keep_all_activities=False)

Filters a DFG (complete, and so connected) on the specified percentage of paths, while ensuring that every node is still reachable from the start and can still reach the end

Parameters:
  • dfg0 – (Complete, and so connected) DFG
  • start_activities0 – Start activities
  • end_activities0 – End activities
  • activities_count0 – Activities of the DFG along with their count
  • percentage – Percentage of paths
  • keep_all_activities – Whether to keep all the activities (including the ones connected by low-occurrence edges), or only the ones appearing in the edges with more occurrences (default).
Returns:

  • dfg – (Filtered) DFG
  • start_activities – (Filtered) start activities
  • end_activities – (Filtered) end activities
  • activities_count – (Filtered) activities of the DFG along with their count

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_to_activity(dfg0, start_activities0, end_activities0, activities_count0, target_activity, parameters=None)

Filters the DFG, making “target_activity” the only possible end activity of the graph

Parameters:
  • dfg0 – Directly-follows graph
  • start_activities0 – Start activities
  • end_activities0 – End activities
  • activities_count0 – Activities count
  • target_activity – Target activity (only possible end activity after the filtering)
  • parameters – Parameters
Returns:

  • dfg – Filtered DFG
  • start_activities – Filtered start activities
  • end_activities – Filtered end activities
  • activities_count – Filtered activities count

pm4py.algo.filtering.dfg.dfg_filtering.generate_nx_graph_from_dfg(dfg, start_activities, end_activities, activities_count)

Generate a NetworkX graph for reachability-checking purposes out of the DFG

Parameters:
  • dfg – DFG
  • start_activities – Start activities
  • end_activities – End activities
  • activities_count – Activities of the DFG along with their count
Returns:

  • G – NetworkX digraph
  • start_node – Identifier of the start node (connected to all the start activities)
  • end_node – Identifier of the end node (connected to all the end activities)
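To show why artificial start and end nodes help with reachability checks, here is a dependency-free sketch that uses a plain successor dictionary in place of a NetworkX digraph. The node identifiers "__start__" and "__end__" and the helper names build_graph_sketch, visit and all_on_start_end_path are hypothetical; the identifiers the function actually returns may differ.

```python
def build_graph_sketch(dfg, start_activities, end_activities, activities_count):
    """Hypothetical stand-in: successor sets plus artificial start and end nodes."""
    start_node, end_node = "__start__", "__end__"  # assumed identifiers
    succ = {start_node: set(), end_node: set()}
    for act in activities_count:
        succ.setdefault(act, set())
    for src, tgt in dfg:
        succ.setdefault(src, set()).add(tgt)
        succ.setdefault(tgt, set())
    for act in start_activities:
        succ.setdefault(act, set())
        succ[start_node].add(act)       # start node -> every start activity
    for act in end_activities:
        succ.setdefault(act, set()).add(end_node)  # every end activity -> end node
    return succ, start_node, end_node


def visit(adjacency, root):
    """Iterative depth-first traversal returning every node reached from root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(adjacency[node])
    return seen


def all_on_start_end_path(succ, start_node, end_node):
    """True if every node is reachable from the start and can reach the end."""
    pred = {node: set() for node in succ}
    for src, targets in succ.items():
        for tgt in targets:
            pred[tgt].add(src)
    nodes = set(succ)
    return visit(succ, start_node) == nodes and visit(pred, end_node) == nodes


dfg = {("a", "b"): 3, ("b", "c"): 3, ("d", "c"): 1}
counts = {"a": 3, "b": 3, "c": 4, "d": 1}
succ, s, e = build_graph_sketch(dfg, {"a": 3}, {"c": 4}, counts)
print(all_on_start_end_path(succ, s, e))  # False: "d" is not reachable from the start
```

Connecting a single start node to all start activities, and all end activities to a single end node, reduces "reachable from some start activity and able to reach some end activity" to two traversals, one forward from the start node and one backward from the end node.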

Module contents
