pm4py.algo.transformation.log_to_features.variants package¶
Submodules¶
pm4py.algo.transformation.log_to_features.variants.event_based module¶
This file is part of PM4Py (More Info: https://pm4py.fit.fraunhofer.de).
PM4Py is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
PM4Py is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with PM4Py. If not, see <https://www.gnu.org/licenses/>.
-
class pm4py.algo.transformation.log_to_features.variants.event_based.Parameters[source]¶
Bases: enum.Enum
An enumeration.
-
FEATURE_NAMES= 'feature_names'¶
-
MAX_NUM_DIFF_STR_VALUES= 'max_num_diff_str_values'¶
-
MIN_NUM_DIFF_STR_VALUES= 'min_num_diff_str_values'¶
-
NUM_EVENT_ATTRIBUTES= 'num_ev_attr'¶
-
STR_EVENT_ATTRIBUTES= 'str_ev_attr'¶
-
-
pm4py.algo.transformation.log_to_features.variants.event_based.apply(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.event_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Extracts all the features for the traces of an event log (each trace becomes a vector of vectors, where each event has its own vector)
Parameters: - log – Event log
- parameters – Parameters of the algorithm, including:
  - STR_EVENT_ATTRIBUTES => string event attributes to consider in the feature extraction
  - NUM_EVENT_ATTRIBUTES => numeric event attributes to consider in the feature extraction
  - FEATURE_NAMES => features to consider (in the given order)
Returns: - data – Data to provide for decision tree learning
- feature_names – Names of the features, in order
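As a rough illustration of the "vector of vectors" shape described above, the following sketch encodes a toy log made of plain Python lists and dicts. It does not call PM4Py. The helper name encode_events and the feature-name scheme (event:<attribute>@<value> for one-hot string features, event:<attribute> for numeric ones) are assumptions made for this example only.

```python
from typing import Any, Dict, List, Tuple

Event = Dict[str, Any]
Trace = List[Event]


def encode_events(
    log: List[Trace], feature_names: List[str]
) -> Tuple[List[List[List[float]]], List[str]]:
    """Hypothetical helper: one vector per event, ordered like feature_names."""
    data = []
    for trace in log:
        trace_vectors = []
        for event in trace:
            vector = []
            for name in feature_names:
                # Assumed naming: "event:<attr>@<value>" is a one-hot string
                # feature, "event:<attr>" is a numeric feature.
                attr_part = name[len("event:"):]
                if "@" in attr_part:
                    attr, value = attr_part.split("@", 1)
                    vector.append(1.0 if str(event.get(attr)) == value else 0.0)
                else:
                    # Missing numeric attributes fall back to 0.0 (assumption).
                    vector.append(float(event.get(attr_part, 0.0)))
            trace_vectors.append(vector)
        data.append(trace_vectors)
    return data, feature_names


log = [
    [{"concept:name": "A", "amount": 10}, {"concept:name": "B", "amount": 25}],
    [{"concept:name": "B"}],
]
names = ["event:concept:name@A", "event:concept:name@B", "event:amount"]
data, returned_names = encode_events(log, names)
```

Each trace keeps its own list of event vectors, so traces of different lengths yield nested lists of different lengths rather than one rectangular matrix.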
-
pm4py.algo.transformation.log_to_features.variants.event_based.extract_all_ev_features_names_from_log(log: pm4py.objects.log.obj.EventLog, str_ev_attr: List[str], num_ev_attr: List[str], parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.event_based.Parameters], Any]] = None) → List[str][source]¶ Extracts the feature names from an event log.
Parameters: - log – Event log
- str_ev_attr – (if provided) list of string event attributes to consider in extracting the feature names
- num_ev_attr – (if provided) list of numeric event attributes to consider in extracting the feature names
- parameters – Parameters, including:
  - MIN_NUM_DIFF_STR_VALUES => minimum number of distinct values an attribute must have to be included as feature(s)
  - MAX_NUM_DIFF_STR_VALUES => maximum number of distinct values an attribute may have to be included as feature(s)
Returns: List of feature names
Return type: feature_names
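The role of the two bounds can be sketched as follows, again on a toy log of plain dicts rather than PM4Py objects. The function name, the default bounds (2 and 50) and the feature-name scheme are assumptions for illustration. The actual defaults used by the library are not stated on this page.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]
Trace = List[Event]


def feature_names_from_log(
    log: List[Trace],
    str_ev_attr: List[str],
    num_ev_attr: List[str],
    min_diff: int = 2,   # stand-in for MIN_NUM_DIFF_STR_VALUES (value assumed)
    max_diff: int = 50,  # stand-in for MAX_NUM_DIFF_STR_VALUES (value assumed)
) -> List[str]:
    """Hypothetical helper: keep string attributes whose number of distinct
    values lies within [min_diff, max_diff], one feature per value."""
    names: List[str] = []
    for attr in str_ev_attr:
        values = sorted({str(e[attr]) for t in log for e in t if attr in e})
        if min_diff <= len(values) <= max_diff:
            names.extend(f"event:{attr}@{v}" for v in values)
    # Numeric attributes contribute a single feature each.
    names.extend(f"event:{attr}" for attr in num_ev_attr)
    return names


log = [
    [{"concept:name": "A", "org:resource": "r1", "amount": 10},
     {"concept:name": "B", "org:resource": "r1", "amount": 25}],
]
names = feature_names_from_log(log, ["concept:name", "org:resource"], ["amount"])
```

Here org:resource takes a single value, so it falls below the assumed lower bound and produces no feature. A very high-cardinality attribute would be dropped by the upper bound in the same way.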
-
pm4py.algo.transformation.log_to_features.variants.event_based.extract_features(log: pm4py.objects.log.obj.EventLog, feature_names: List[str], parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.event_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Extracts the matrix of the features from an event log
Parameters: - log – Event log
- feature_names – Features to consider (in the given order)
Returns: - data – Data to provide for decision tree learning
- feature_names – Names of the features, in order
pm4py.algo.transformation.log_to_features.variants.trace_based module¶
-
class pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters[source]¶
Bases: enum.Enum
An enumeration.
-
ACTIVITY_KEY= 'pm4py:param:activity_key'¶
-
ADD_CASE_IDENTIFIER_COLUMN= 'add_case_identifier_column'¶
-
CASE_ATTRIBUTE_PREFIX= 'case:'¶
-
CASE_ID_KEY= 'pm4py:param:case_id_key'¶
-
DEFAULT_NOT_PRESENT= 'default_not_present'¶
-
ENABLE_ACTIVITY_DEF_REPRESENTATION= 'enable_activity_def_representation'¶
-
ENABLE_ALL_EXTRA_FEATURES= 'enable_all_extra_features'¶
-
ENABLE_CASE_DURATION= 'enable_case_duration'¶
-
ENABLE_DIRECT_PATHS_TIMES_LAST_OCC= 'enable_direct_paths_times_last_occ'¶
-
ENABLE_FIRST_LAST_ACTIVITY_INDEX= 'enable_first_last_activity_index'¶
-
ENABLE_INDIRECT_PATHS_TIMES_LAST_OCC= 'enable_indirect_paths_times_last_occ'¶
-
ENABLE_MAX_CONCURRENT_EVENTS= 'enable_max_concurrent_events'¶
-
ENABLE_MAX_CONCURRENT_EVENTS_PER_ACTIVITY= 'enable_max_concurrent_events_per_activity'¶
-
ENABLE_RESOURCE_WORKLOAD= 'enable_resource_workload'¶
-
ENABLE_SUCC_DEF_REPRESENTATION= 'enable_succ_def_representation'¶
-
ENABLE_TIMES_FROM_FIRST_OCCURRENCE= 'enable_times_from_first_occurrence'¶
-
ENABLE_TIMES_FROM_LAST_OCCURRENCE= 'enable_times_from_last_occurrence'¶
-
ENABLE_WORK_IN_PROGRESS= 'enable_work_in_progress'¶
-
EPSILON= 'epsilon'¶
-
FEATURE_NAMES= 'feature_names'¶
-
NUM_EVENT_ATTRIBUTES= 'num_ev_attr'¶
-
NUM_TRACE_ATTRIBUTES= 'num_tr_attr'¶
-
RESOURCE_KEY= 'pm4py:param:resource_key'¶
-
START_TIMESTAMP_KEY= 'pm4py:param:start_timestamp_key'¶
-
STR_EVENT_ATTRIBUTES= 'str_ev_attr'¶
-
STR_EVSUCC_ATTRIBUTES= 'str_evsucc_attr'¶
-
STR_TRACE_ATTRIBUTES= 'str_tr_attr'¶
-
TIMESTAMP_KEY= 'pm4py:param:timestamp_key'¶
-
-
pm4py.algo.transformation.log_to_features.variants.trace_based.apply(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Extract the features from an event log (a vector for each trace)
Parameters: - log – Log
- parameters – Parameters of the algorithm, including:
  - STR_TRACE_ATTRIBUTES => string trace attributes to consider in the feature extraction
  - STR_EVENT_ATTRIBUTES => string event attributes to consider in the feature extraction
  - NUM_TRACE_ATTRIBUTES => numeric trace attributes to consider in the feature extraction
  - NUM_EVENT_ATTRIBUTES => numeric event attributes to consider in the feature extraction
  - STR_EVSUCC_ATTRIBUTES => successions of event attribute values to consider in the feature extraction
  - FEATURE_NAMES => features to consider (in the given order)
  - ENABLE_ALL_EXTRA_FEATURES => enables all the extra features
  - ENABLE_CASE_DURATION => enables the case duration as an additional feature
  - ENABLE_TIMES_FROM_FIRST_OCCURRENCE => adds, for the first occurrence of each activity in a case, the time from the start of the case and the time to the end of the case
  - ADD_CASE_IDENTIFIER_COLUMN => adds the case identifier (string) as a column of the feature table (default: False)
  - ENABLE_TIMES_FROM_LAST_OCCURRENCE => adds, for the last occurrence of each activity in a case, the time from the start of the case and the time to the end of the case
  - ENABLE_DIRECT_PATHS_TIMES_LAST_OCC => adds the duration of the last occurrence of a direct (i, i+1) path in the case as a feature
  - ENABLE_INDIRECT_PATHS_TIMES_LAST_OCC => adds the duration of the last occurrence of an indirect (i, j) path in the case as a feature
  - ENABLE_WORK_IN_PROGRESS => enables the work in progress (number of concurrent cases) as a feature
  - ENABLE_RESOURCE_WORKLOAD => enables the resource workload as a feature
  - ENABLE_FIRST_LAST_ACTIVITY_INDEX => enables the insertion of the first and last indexes of the activities as features
  - ENABLE_MAX_CONCURRENT_EVENTS => enables the count of the maximum number of concurrent events inside a case
  - ENABLE_MAX_CONCURRENT_EVENTS_PER_ACTIVITY => enables the count of the maximum number of concurrent events per activity
Returns: - data – Data to provide for decision tree learning
- feature_names – Names of the features, in order
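The signature types the parameters dictionary as Dict[Union[str, Parameters], Any], so a setting may be keyed either by the enum member or by a string. A minimal sketch of how such a lookup could work is shown here. The helper get_param is hypothetical and the enum is a two-member stand-in for the class listed above; whether PM4Py resolves keys in exactly this order is an assumption.

```python
from enum import Enum
from typing import Any, Dict, Optional, Union


class Parameters(Enum):
    # Two members copied from the listing above; the others are omitted.
    ENABLE_CASE_DURATION = "enable_case_duration"
    ADD_CASE_IDENTIFIER_COLUMN = "add_case_identifier_column"


ParamDict = Optional[Dict[Union[str, Parameters], Any]]


def get_param(key: Parameters, parameters: ParamDict, default: Any) -> Any:
    """Hypothetical helper: look a setting up by enum member first,
    then by the member's string value, else return the default."""
    if parameters is None:
        return default
    if key in parameters:
        return parameters[key]
    return parameters.get(key.value, default)


by_enum = get_param(
    Parameters.ENABLE_CASE_DURATION, {Parameters.ENABLE_CASE_DURATION: True}, False
)
by_string = get_param(
    Parameters.ENABLE_CASE_DURATION, {"enable_case_duration": True}, False
)
fallback = get_param(Parameters.ADD_CASE_IDENTIFIER_COLUMN, None, False)
```

Accepting both key styles lets callers pass a plain string-keyed dictionary without importing the variant's Parameters class.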
-
pm4py.algo.transformation.log_to_features.variants.trace_based.case_duration(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Calculates for each case, the case duration (and adds it as a feature)
Parameters: - log – Event log
- parameters – Parameters of the algorithm
Returns: - data – Numeric value of the features
- feature_names – Names of the features
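A case duration feature can be sketched as the span between the earliest start and the latest completion in a trace. The version below works on plain dicts. The function name, the feature name case_duration_seconds, the choice of seconds as the unit, and the 0.0 value for empty traces are all assumptions for the example, not PM4Py's documented behaviour.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple

Trace = List[Dict[str, Any]]


def case_durations(
    log: List[Trace],
    timestamp_key: str = "time:timestamp",
    start_timestamp_key: Optional[str] = None,
) -> Tuple[List[List[float]], List[str]]:
    """Hypothetical helper: one single-column row per case."""
    # Without a separate start timestamp, events are treated as instantaneous.
    start_key = start_timestamp_key or timestamp_key
    data = []
    for trace in log:
        if trace:
            start = min(e[start_key] for e in trace)
            end = max(e[timestamp_key] for e in trace)
            data.append([(end - start).total_seconds()])
        else:
            data.append([0.0])  # assumed value for an empty trace
    return data, ["case_duration_seconds"]


log = [
    [{"time:timestamp": datetime(2024, 1, 1, 10, 0)},
     {"time:timestamp": datetime(2024, 1, 1, 10, 30)}],
    [],
]
data, feature_names = case_durations(log)
```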
-
pm4py.algo.transformation.log_to_features.variants.trace_based.direct_paths_times_last_occ(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Calculates, for each case and for each direct path of the case, the difference between the start timestamp of the later event and the completion timestamp of the earlier event (for the last occurrence of the path). A default value is used if a path is not present in a case.
Parameters: - log – Event log
- parameters – Parameters of the algorithm
Returns: - data – Numeric value of the features
- feature_names – Names of the features
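The computation described above can be sketched on toy data as follows. Events are plain dicts with hypothetical start and complete keys; the helper names, the A->B feature naming, seconds as the unit, and 0.0 as the default are assumptions for the example.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Tuple

T0 = datetime(2024, 1, 1, 9, 0)


def ev(name: str, start_min: int, end_min: int) -> Dict[str, Any]:
    """Build a toy event with start/completion offsets in minutes."""
    return {
        "concept:name": name,
        "start": T0 + timedelta(minutes=start_min),
        "complete": T0 + timedelta(minutes=end_min),
    }


def direct_path_times(
    log: List[List[Dict[str, Any]]], default: float = 0.0
) -> Tuple[List[List[float]], List[str]]:
    per_case = []
    all_paths = set()
    for trace in log:
        times = {}
        # Consecutive pairs (i, i+1); a later occurrence of the same path
        # overwrites the earlier one, so the last occurrence wins.
        for earlier, later in zip(trace, trace[1:]):
            path = (earlier["concept:name"], later["concept:name"])
            times[path] = (later["start"] - earlier["complete"]).total_seconds()
        per_case.append(times)
        all_paths.update(times)
    paths = sorted(all_paths)
    feature_names = [f"{a}->{b}" for a, b in paths]
    data = [[times.get(p, default) for p in paths] for times in per_case]
    return data, feature_names


log = [
    [ev("A", 0, 5), ev("B", 10, 12), ev("A", 15, 16), ev("B", 20, 21)],
    [ev("A", 0, 1), ev("C", 3, 4)],
]
data, feature_names = direct_path_times(log)
```

In the first case the path A->B occurs twice, and only the second gap (4 minutes) is kept. The second case has no A->B path, so that column holds the default.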
-
pm4py.algo.transformation.log_to_features.variants.trace_based.first_last_activity_index_trace(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Consider as features the first and the last index of an activity inside a case
Parameters: - log – Event log
- parameters – Parameters, including:
  - Parameters.ACTIVITY_KEY => the attribute to use as activity
  - Parameters.DEFAULT_NOT_PRESENT => the replacement value for activities that are not present for the specific case
Returns: - data – Numeric value of the features
- feature_names – Names of the features
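A minimal sketch of this feature on plain dicts is given below. The function name, the first_index@/last_index@ feature names, and the default of -1 for absent activities are assumptions chosen for the example.

```python
from typing import Any, Dict, List, Tuple


def first_last_activity_index(
    log: List[List[Dict[str, Any]]],
    activity_key: str = "concept:name",
    default_not_present: int = -1,  # assumed stand-in for DEFAULT_NOT_PRESENT
) -> Tuple[List[List[int]], List[str]]:
    activities = sorted({e[activity_key] for t in log for e in t})
    feature_names: List[str] = []
    for a in activities:
        feature_names += [f"first_index@{a}", f"last_index@{a}"]

    data = []
    for trace in log:
        first: Dict[str, int] = {}
        last: Dict[str, int] = {}
        for i, event in enumerate(trace):
            act = event[activity_key]
            first.setdefault(act, i)  # keep only the first position
            last[act] = i             # overwrite until the last position
        row: List[int] = []
        for a in activities:
            row += [first.get(a, default_not_present),
                    last.get(a, default_not_present)]
        data.append(row)
    return data, feature_names


log = [
    [{"concept:name": "A"}, {"concept:name": "B"}, {"concept:name": "A"}],
    [{"concept:name": "B"}],
]
data, feature_names = first_last_activity_index(log)
```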
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_all_string_event_attribute_values(log: pm4py.objects.log.obj.EventLog, event_attribute: str) → List[str][source]¶ Get the representations of all the values of a string event attribute, across all the traces of the log
Parameters: - log – Event log
- event_attribute – Event attribute to consider
Returns: All feature names present for the given attribute in the given log
Return type: values
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_all_string_event_succession_attribute_values(log: pm4py.objects.log.obj.EventLog, event_attribute: str) → List[str][source]¶ Get the representations of all the successions of values of a string event attribute, across all the traces of the log
Parameters: - log – Event log
- event_attribute – Event attribute to consider
Returns: All feature names present for the given attribute succession in the given log
Return type: values
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_all_string_trace_attribute_values(log: pm4py.objects.log.obj.EventLog, trace_attribute: str) → List[str][source]¶ Get all string trace attribute values representations for a log
Parameters: - log – Event log
- trace_attribute – Attribute of the trace to consider
Returns: List containing for each trace a representation of the feature name associated to the attribute
Return type: list
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_default_representation(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None, feature_names: Optional[List[str]] = None) → Tuple[Any, List[str]][source]¶ Gets the default data representation of an event log (for process tree building)
Parameters: - log – Event log
- parameters – Possible parameters of the algorithm
- feature_names – (If provided) Features to use in the representation of the log
Returns: - data – Data to provide for decision tree learning
- feature_names – Names of the features, in order
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_default_representation_with_attribute_names(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None, feature_names: Optional[List[str]] = None) → Tuple[Any, List[str], List[str], List[str], List[str], List[str]][source]¶ Gets the default data representation of an event log (for process tree building) returning also the attribute names
Parameters: - log – Event log
- parameters – Possible parameters of the algorithm
- feature_names – (If provided) Features to use in the representation of the log
Returns: - data – Data to provide for decision tree learning
- feature_names – Names of the features, in order
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_numeric_event_attribute_rep(event_attribute: str) → str[source]¶ Get the feature name associated to a numeric event attribute
Parameters: event_attribute – Name of the event attribute
Returns: Name of the feature
Return type: feature_name
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_numeric_event_attribute_value(event: pm4py.objects.log.obj.Event, event_attribute: str) → Union[int, float][source]¶ Get the value of a numeric event attribute from a given event
Parameters: - event – Event
- event_attribute – Event attribute to consider
Returns: Value of the numeric event attribute for the given event
Return type: value
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_numeric_event_attribute_value_trace(trace: pm4py.objects.log.obj.Trace, event_attribute: str) → Union[int, float][source]¶ Get the value of the last occurrence of a numeric event attribute given a trace
Parameters: - trace – Trace of the log
- event_attribute – Event attribute to consider
Returns: Value of the last occurrence of a numeric event attribute for the given trace
Return type: value
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_numeric_trace_attribute_rep(trace_attribute: str) → str[source]¶ Get the feature name associated to a numeric trace attribute
Parameters: trace_attribute – Name of the trace attribute
Returns: Name of the feature
Return type: feature_name
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_numeric_trace_attribute_value(trace: pm4py.objects.log.obj.Trace, trace_attribute: str) → Union[int, float][source]¶ Get the value of a numeric trace attribute from a given trace
Parameters: - trace – Trace of the log
- trace_attribute – Attribute of the trace to consider
Returns: Value of the numeric trace attribute for the given trace
Return type: value
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_representation(log: pm4py.objects.log.obj.EventLog, str_tr_attr: List[str], str_ev_attr: List[str], num_tr_attr: List[str], num_ev_attr: List[str], str_evsucc_attr: Optional[List[str]] = None, feature_names: Optional[List[str]] = None) → Tuple[Any, List[str]][source]¶ Get a representation of the event log that is suited for the data part of the decision tree learning
NOTE: this function only encodes the last value seen for each attribute
Parameters: - log – Event log
- str_tr_attr – List of string trace attributes to consider in data vector creation
- str_ev_attr – List of string event attributes to consider in data vector creation
- num_tr_attr – List of numeric trace attributes to consider in data vector creation
- num_ev_attr – List of numeric event attributes to consider in data vector creation
- str_evsucc_attr – List of attributes whose successions of values are to be considered in data vector creation
- feature_names – (If provided) Features to use in the representation of the log
Returns: - data – Data to provide for decision tree learning
- feature_names – Names of the features, in order
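The note about encoding only the last value seen can be illustrated with a small sketch that builds one row per trace. It uses plain dicts instead of PM4Py objects. The function name encode_trace, the one-hot treatment of string values, and the 0 fallback for missing numeric attributes are assumptions for the example.

```python
from typing import Any, Dict, List, Tuple


def encode_trace(
    trace: List[Dict[str, Any]],
    str_ev_values: List[Tuple[str, str]],  # (attribute, value) pairs
    num_ev_attr: List[str],
) -> List[float]:
    """Hypothetical helper: one flat vector for a whole trace."""
    row: List[float] = []
    # String event attributes: 1 if the value occurs anywhere in the trace.
    for attr, value in str_ev_values:
        row.append(1 if any(e.get(attr) == value for e in trace) else 0)
    # Numeric event attributes: only the last value seen is kept.
    for attr in num_ev_attr:
        seen = [e[attr] for e in trace if attr in e]
        row.append(seen[-1] if seen else 0)
    return row


trace = [
    {"concept:name": "A", "amount": 10},
    {"concept:name": "B", "amount": 25},
    {"concept:name": "A"},
]
row = encode_trace(
    trace,
    [("concept:name", "A"), ("concept:name", "B"), ("concept:name", "C")],
    ["amount"],
)
```

The earlier amount of 10 is lost: only 25, the last value present in the trace, reaches the vector. This is the main limitation to keep in mind when a numeric attribute changes during a case.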
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_string_event_attribute_rep(event: pm4py.objects.log.obj.Event, event_attribute: str) → str[source]¶ Get a representation of the feature name associated to a string event attribute value
Parameters: - event – Single event of a trace
- event_attribute – Event attribute to consider
Returns: Representation of the feature name associated to a string event attribute value
Return type: rep
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_string_event_attribute_succession_rep(event1: pm4py.objects.log.obj.Event, event2: pm4py.objects.log.obj.Event, event_attribute: str) → str[source]¶ Get a representation of the feature name associated with a succession of two string event attribute values
Parameters: - event1 – First event of the succession
- event2 – Second event of the succession
- event_attribute – Event attribute to consider
Returns: Representation of the feature name associated with the succession of the two string event attribute values
Return type: rep
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_string_trace_attribute_rep(trace: pm4py.objects.log.obj.Trace, trace_attribute: str) → str[source]¶ Get a representation of the feature name associated to a string trace attribute value
Parameters: - trace – Trace of the log
- trace_attribute – Attribute of the trace to consider
Returns: Representation of the feature name associated to a string trace attribute value
Return type: rep
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_values_event_attribute_for_trace(trace: pm4py.objects.log.obj.Trace, event_attribute: str) → Set[str][source]¶ Get the representations of all the values of a string event attribute found in the events of a trace
Parameters: - trace – Trace of the log
- event_attribute – Event attribute to consider
Returns: All feature names present for the given attribute in the given trace
Return type: values
-
pm4py.algo.transformation.log_to_features.variants.trace_based.get_values_event_attribute_succession_for_trace(trace: pm4py.objects.log.obj.Trace, event_attribute: str) → Set[str][source]¶ Get the representations of all the successions of values of a string event attribute found in the events of a trace
Parameters: - trace – Trace of the log
- event_attribute – Event attribute to consider
Returns: All feature names present for the given attribute succession in the given trace
Return type: values
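Collecting succession features for one trace amounts to walking over consecutive event pairs. The sketch below uses plain dicts; the function name and the succession:<attribute>@<value1>#<value2> naming are assumptions for illustration and may differ from the strings PM4Py actually produces.

```python
from typing import Any, Dict, List, Set


def succession_values(trace: List[Dict[str, Any]], event_attribute: str) -> Set[str]:
    """Hypothetical helper: feature names for directly-following value pairs."""
    return {
        f"succession:{event_attribute}@{a[event_attribute]}#{b[event_attribute]}"
        for a, b in zip(trace, trace[1:])
        # Pairs where either event lacks the attribute are skipped.
        if event_attribute in a and event_attribute in b
    }


trace = [
    {"concept:name": "A"},
    {"concept:name": "B"},
    {"concept:name": "A"},
    {"concept:name": "B"},
]
values = succession_values(trace, "concept:name")
```

Because the result is a set, the repeated A-then-B succession is reported once.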
-
pm4py.algo.transformation.log_to_features.variants.trace_based.indirect_paths_times_last_occ(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Calculates, for each case and for each indirect path of the case, the difference between the start timestamp of the later event and the completion timestamp of the earlier event (for the last occurrence of the path). A default value is used if a path is not present in a case.
Parameters: - log – Event log
- parameters – Parameters of the algorithm
Returns: - data – Numeric value of the features
- feature_names – Names of the features
-
pm4py.algo.transformation.log_to_features.variants.trace_based.max_concurrent_events(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Counts for every trace the maximum number of events (of any activity) that happen concurrently (i.e., their time intervals [st1, ct1] and [st2, ct2] have a non-empty intersection).
Parameters: - log – Event log
- parameters – Parameters of the algorithm
Returns: - data – Numeric value of the features
- feature_names – Names of the features
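One common way to find the maximum number of overlapping closed intervals is a sweep over sorted endpoints. The sketch below shows that technique on (start, completion) pairs given as plain numbers. It is an illustration of the idea under that assumption, not a description of how PM4Py implements the feature.

```python
from typing import List, Tuple


def max_concurrent(intervals: List[Tuple[float, float]]) -> int:
    """Largest number of closed intervals [st, ct] sharing a common point."""
    points = []
    for st, ct in intervals:
        # Kind 0 (open) sorts before kind 1 (close) at equal times, so two
        # intervals that merely touch at an endpoint still count as overlapping.
        points.append((st, 0))
        points.append((ct, 1))
    points.sort()

    best = current = 0
    for _, kind in points:
        if kind == 0:
            current += 1
            best = max(best, current)
        else:
            current -= 1
    return best


overlapping = max_concurrent([(0, 5), (3, 8), (5, 6)])
disjoint = max_concurrent([(0, 1), (2, 3)])
```

Sorting dominates the cost, so the sweep runs in O(n log n) per trace instead of comparing every pair of events.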
-
pm4py.algo.transformation.log_to_features.variants.trace_based.max_concurrent_events_per_activity(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Counts for every trace and every activity the maximum number of events of the given activity that happen concurrently (i.e., their time intervals [st1, ct1] and [st2, ct2] have a non-empty intersection).
Parameters: - log – Event log
- parameters – Parameters of the algorithm
Returns: - data – Numeric value of the features
- feature_names – Names of the features
-
pm4py.algo.transformation.log_to_features.variants.trace_based.resource_workload(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Calculates, for each case and for each resource of the log, the workload of the resource during the lead time of the case. A default value is used if a resource is not contained in a case.
Parameters: - log – Event log
- parameters – Parameters of the algorithm
Returns: - data – Numeric value of the features
- feature_names – Names of the features
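This page does not define "workload" precisely. One plausible reading, used in the sketch below, is the number of events the resource performs anywhere in the log while the case is open. That reading, the function name, the numeric timestamps, and the default of 0 are all assumptions for the example.

```python
from typing import Any, Dict, List, Tuple


def resource_workload(
    log: List[List[Dict[str, Any]]],
    resource_key: str = "org:resource",
    timestamp_key: str = "time:timestamp",
    default: int = 0,
) -> Tuple[List[List[int]], List[str]]:
    resources = sorted({e[resource_key] for t in log for e in t})
    all_events = [(e[timestamp_key], e[resource_key]) for t in log for e in t]

    data = []
    for trace in log:
        # Lead time of the case: from its first to its last timestamp.
        st = min(e[timestamp_key] for e in trace)
        ct = max(e[timestamp_key] for e in trace)
        in_case = {e[resource_key] for e in trace}
        row = []
        for res in resources:
            if res in in_case:
                row.append(sum(1 for ts, r in all_events
                               if r == res and st <= ts <= ct))
            else:
                row.append(default)  # resource does not appear in this case
        data.append(row)
    return data, [f"workload@{r}" for r in resources]


log = [
    [{"org:resource": "r1", "time:timestamp": 0},
     {"org:resource": "r2", "time:timestamp": 10}],
    [{"org:resource": "r1", "time:timestamp": 5},
     {"org:resource": "r1", "time:timestamp": 20}],
]
data, feature_names = resource_workload(log)
```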
-
pm4py.algo.transformation.log_to_features.variants.trace_based.times_from_first_occurrence_activity_case(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Calculates, for each case and for each activity, the time from the start of the case to the first occurrence of the activity in the case, and the time from that occurrence to the end of the case.
Parameters: - log – Event log
- parameters – Parameters of the algorithm
Returns: - data – Numeric value of the features
- feature_names – Names of the features
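For a single trace, the two times per activity can be sketched as below. The trace is assumed to be sorted by timestamp, so the first and last events mark the start and end of the case. The function name, the dict-of-tuples return shape, and seconds as the unit are assumptions for the example.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Tuple


def times_from_first_occurrence(
    trace: List[Dict[str, Any]],
    activity_key: str = "concept:name",
    timestamp_key: str = "time:timestamp",
) -> Dict[str, Tuple[float, float]]:
    """Map each activity to (seconds since case start, seconds to case end),
    measured at the activity's first occurrence."""
    case_start = trace[0][timestamp_key]
    case_end = trace[-1][timestamp_key]
    result: Dict[str, Tuple[float, float]] = {}
    for event in trace:
        act = event[activity_key]
        if act not in result:  # later occurrences are ignored
            ts = event[timestamp_key]
            result[act] = ((ts - case_start).total_seconds(),
                           (case_end - ts).total_seconds())
    return result


t0 = datetime(2024, 1, 1, 9, 0)
trace = [
    {"concept:name": "A", "time:timestamp": t0},
    {"concept:name": "B", "time:timestamp": t0 + timedelta(minutes=10)},
    {"concept:name": "A", "time:timestamp": t0 + timedelta(minutes=30)},
]
times = times_from_first_occurrence(trace)
```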
-
pm4py.algo.transformation.log_to_features.variants.trace_based.times_from_last_occurrence_activity_case(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Calculates, for each case and for each activity, the time from the start of the case to the last occurrence of the activity in the case, and the time from that occurrence to the end of the case.
Parameters: - log – Event log
- parameters – Parameters of the algorithm
Returns: - data – Numeric value of the features
- feature_names – Names of the features
-
pm4py.algo.transformation.log_to_features.variants.trace_based.work_in_progress(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) → Tuple[Any, List[str]][source]¶ Calculates, for each case, the number of cases which are open during the lead time of the case.
Parameters: - log – Event log
- parameters – Parameters of the algorithm
Returns: - data – Numeric value of the features
- feature_names – Names of the features
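Counting the cases that are open during a case's lead time reduces to an interval-overlap test between case spans. The sketch below uses numeric timestamps and non-empty traces for brevity. The function name, the feature name, and the choice to count the case itself are assumptions, and the quadratic loop is kept only for clarity.

```python
from typing import Any, Dict, List, Tuple


def work_in_progress(
    log: List[List[Dict[str, Any]]], timestamp_key: str = "time:timestamp"
) -> Tuple[List[List[int]], List[str]]:
    # Span of each case: (first timestamp, last timestamp).
    spans = [
        (min(e[timestamp_key] for e in t), max(e[timestamp_key] for e in t))
        for t in log
    ]
    data = []
    for st, ct in spans:
        # Two closed spans overlap when each starts before the other ends.
        # The case itself satisfies the test, so every count is at least 1.
        open_cases = sum(1 for s, c in spans if s <= ct and st <= c)
        data.append([open_cases])
    return data, ["work_in_progress"]


log = [
    [{"time:timestamp": 0}, {"time:timestamp": 10}],
    [{"time:timestamp": 5}, {"time:timestamp": 15}],
    [{"time:timestamp": 20}, {"time:timestamp": 30}],
]
data, feature_names = work_in_progress(log)
```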
Module contents¶