itemset_mining.two_phase_huim module
-
itemset_mining.two_phase_huim.CandidateHUIRecord
-
class itemset_mining.two_phase_huim.HUIRecord(items, itemset_utility)
Bases: tuple
-
property items
Alias for field number 0
-
property itemset_utility
Alias for field number 1
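Because HUIRecord derives from tuple and exposes two named fields, it can be read either by field name or by position. The sketch below illustrates this with a local stand-in built from `collections.namedtuple`; the stand-in is an assumption made for illustration, not the library's own definition.

```python
from collections import namedtuple

# Local stand-in for illustration only; the real class lives in
# itemset_mining.two_phase_huim.
HUIRecord = namedtuple("HUIRecord", ["items", "itemset_utility"])

record = HUIRecord(items=("Chips", "Coke 12oz"), itemset_utility=30.02)

# Field number 0 and field number 1 are reachable by name or by index.
assert record.items == record[0]
assert record.itemset_utility == record[1]

# Ordinary tuple unpacking works as well.
items, utility = record
print(items, utility)
```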
-
class itemset_mining.two_phase_huim.TwoPhase(transactions: Union[Generator[Tuple[str, Union[int, float]], None, None], List[Tuple[str, Union[int, float]]]], external_utilities: Dict, minutil: int)
Bases: object
Example
>>> from itemset_mining.two_phase_huim import TwoPhase
>>> from operator import attrgetter
>>> transactions = [
...     [("Coke 12oz", 6), ("Chips", 2), ("Dip", 1)],
...     [("Coke 12oz", 1)],
...     [("Coke 12oz", 2), ("Chips", 1)],
...     [("Chips", 1)],
...     [("Chips", 2)],
...     [("Coke 12oz", 6), ("Chips", 1)]
... ]
>>> # ARP for each item
>>> external_utilities = {
...     "Coke 12oz": 1.29,
...     "Chips": 2.99,
...     "Dip": 3.49
... }
>>> # Minimum dollar value generated by an itemset we care about across all transactions
>>> minutil = 20.00
>>> hui = TwoPhase(transactions, external_utilities, minutil)
>>> result = hui.get_hui()
>>> sorted(result, key=attrgetter('itemset_utility'), reverse=True)
[HUIRecord(items=('Chips', 'Coke 12oz'), itemset_utility=30.02), HUIRecord(items=('Chips',), itemset_utility=20.93)]
-
calc_twu(itemset: List[str])
Calculate the transaction-weighted utilization (TWU) of an itemset, that is, the summed utility of every transaction that contains the itemset.
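As a rough illustration of the TWU definition, the following standalone sketch recomputes it on the example data from this page. The helper names `transaction_utility` and `twu` are hypothetical and do not mirror the library's internals.

```python
transactions = [
    [("Coke 12oz", 6), ("Chips", 2), ("Dip", 1)],
    [("Coke 12oz", 1)],
    [("Coke 12oz", 2), ("Chips", 1)],
    [("Chips", 1)],
    [("Chips", 2)],
    [("Coke 12oz", 6), ("Chips", 1)],
]
external_utilities = {"Coke 12oz": 1.29, "Chips": 2.99, "Dip": 3.49}


def transaction_utility(transaction, external_utilities):
    """Total value of one transaction: sum of quantity * unit utility."""
    return sum(qty * external_utilities[item] for item, qty in transaction)


def twu(itemset, transactions, external_utilities):
    """Sum the utilities of every transaction containing the whole itemset."""
    wanted = set(itemset)
    return sum(
        transaction_utility(t, external_utilities)
        for t in transactions
        if wanted <= {item for item, _ in t}
    )


pair_twu = twu(["Chips", "Coke 12oz"], transactions, external_utilities)
dip_twu = twu(["Dip"], transactions, external_utilities)
print(round(pair_twu, 2), round(dip_twu, 2))
```

On this data the TWU of ('Chips', 'Coke 12oz') comes to 33.51, above its exact utility of 30.02. TWU counts whole transactions, including items outside the itemset, so it acts as an upper bound on the itemset's utility.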
-
classmethod ensure_transaction_items_unique(transactions)
Ensure there are no duplicate items within a transaction, while preserving generators.
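One way such a check could preserve generators is to wrap a generator input in another lazy generator and validate list input eagerly. The sketch below works under that assumption. The function names are hypothetical, and raising `ValueError` on a duplicate is a guess at reasonable behavior, not documented behavior of the library.

```python
from collections import Counter
from types import GeneratorType


def check_transaction(transaction):
    """Return the transaction unchanged, or raise if an item repeats."""
    counts = Counter(item for item, _ in transaction)
    duplicates = sorted(item for item, n in counts.items() if n > 1)
    if duplicates:
        # Assumed behavior: reject the transaction rather than merge quantities.
        raise ValueError(f"duplicate items in transaction: {duplicates}")
    return transaction


def ensure_unique(transactions):
    """Validate every transaction; generators stay lazy, lists stay lists."""
    if isinstance(transactions, GeneratorType):
        return (check_transaction(t) for t in transactions)
    return [check_transaction(t) for t in transactions]


good = [[("Chips", 1)], [("Coke 12oz", 2), ("Chips", 1)]]
bad = [[("Chips", 1), ("Chips", 2)]]

checked_list = ensure_unique(good)
checked_gen = ensure_unique(t for t in good)

try:
    ensure_unique(bad)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

With the lazy branch, a duplicate inside a generator only surfaces when the offending transaction is actually consumed.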
-
get_high_twu_itemsets(max_length: Optional[int] = None) → Generator[itemset_mining.two_phase_huim.HUIRecord, None, None]
Return a generator of records for the itemsets with high transaction-weighted utilization in the given transactions.
This is “Phase 1.”
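A simplified picture of Phase 1, assuming "high" means a TWU of at least minutil: discard single items whose TWU falls below the threshold, then enumerate combinations of the survivors and keep those that still meet it. This sketch enumerates combinations by brute force and is not the library's implementation. The names `twu` and `high_twu_itemsets` are hypothetical.

```python
from itertools import combinations

transactions = [
    [("Coke 12oz", 6), ("Chips", 2), ("Dip", 1)],
    [("Coke 12oz", 1)],
    [("Coke 12oz", 2), ("Chips", 1)],
    [("Chips", 1)],
    [("Chips", 2)],
    [("Coke 12oz", 6), ("Chips", 1)],
]
external_utilities = {"Coke 12oz": 1.29, "Chips": 2.99, "Dip": 3.49}
minutil = 20.00


def twu(itemset):
    """Summed utility of all transactions that contain every item in itemset."""
    wanted = set(itemset)
    return sum(
        sum(qty * external_utilities[item] for item, qty in t)
        for t in transactions
        if wanted <= {item for item, _ in t}
    )


def high_twu_itemsets(max_length=None):
    """Yield itemsets (as sorted tuples) whose TWU is at least minutil."""
    all_items = sorted({item for t in transactions for item, _ in t})
    # An item with low TWU cannot belong to any high-TWU itemset, because
    # adding items can only shrink the set of matching transactions.
    kept = [item for item in all_items if twu((item,)) >= minutil]
    limit = len(kept) if max_length is None else min(max_length, len(kept))
    for length in range(1, limit + 1):
        for candidate in combinations(kept, length):
            if twu(candidate) >= minutil:
                yield candidate


candidates = list(high_twu_itemsets())
print(candidates)
```

Here "Dip" is pruned because its TWU (17.21) is below 20.00, while "Coke 12oz" survives as a candidate even though its exact utility later turns out to be too low.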
-
get_hui(max_length: Optional[int] = None) → Generator[itemset_mining.two_phase_huim.HUIRecord, None, None]
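No description is given for get_hui, but the class example shows it yielding only itemsets whose utility reaches minutil. Assuming it acts as the second phase, computing the exact utility of each Phase 1 candidate and filtering on the threshold, the step might be sketched as follows. The helper `itemset_utility` and the hard-coded candidate list are illustrative assumptions.

```python
transactions = [
    [("Coke 12oz", 6), ("Chips", 2), ("Dip", 1)],
    [("Coke 12oz", 1)],
    [("Coke 12oz", 2), ("Chips", 1)],
    [("Chips", 1)],
    [("Chips", 2)],
    [("Coke 12oz", 6), ("Chips", 1)],
]
external_utilities = {"Coke 12oz": 1.29, "Chips": 2.99, "Dip": 3.49}
minutil = 20.00

# Assumed output of a Phase 1 pass over the data above.
candidates = [("Chips",), ("Coke 12oz",), ("Chips", "Coke 12oz")]


def itemset_utility(itemset):
    """Exact utility: count only the itemset's own items, and only in
    transactions that contain all of them."""
    wanted = set(itemset)
    total = 0.0
    for transaction in transactions:
        quantities = dict(transaction)
        if wanted <= set(quantities):
            total += sum(quantities[i] * external_utilities[i] for i in itemset)
    return total


high_utility = {}
for candidate in candidates:
    utility = itemset_utility(candidate)
    if utility >= minutil:
        high_utility[candidate] = round(utility, 2)

print(high_utility)
```

"Coke 12oz" passes the TWU screen but is dropped here, since its exact utility is 15 units at 1.29 each, or 19.35. The two survivors match the records shown in the class example.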
-
property items
Returns the list of items that appear in transactions.
-