itemset_mining.two_phase_huim module

itemset_mining.two_phase_huim.CandidateHUIRecord

alias of itemset_mining.two_phase_huim.HUIRecord

class itemset_mining.two_phase_huim.HUIRecord(items, itemset_utility)

Bases: tuple

property items

Alias for field number 0

property itemset_utility

Alias for field number 1

class itemset_mining.two_phase_huim.TwoPhase(transactions: Union[Generator[Tuple[str, Union[int, float]], None, None], List[Tuple[str, Union[int, float]]]], external_utilities: Dict, minutil: int)[source]

Bases: object

Example

>>> from itemset_mining.two_phase_huim import TwoPhase
>>> from operator import attrgetter
>>> transactions = [
...     [("Coke 12oz", 6), ("Chips", 2), ("Dip", 1)],
...     [("Coke 12oz", 1)],
...     [("Coke 12oz", 2), ("Chips", 1)],
...     [("Chips", 1)],
...     [("Chips", 2)],
...     [("Coke 12oz", 6), ("Chips", 1)]
... ]
>>> # ARP for each item
>>> external_utilities = {
...     "Coke 12oz": 1.29,
...     "Chips": 2.99,
...     "Dip": 3.49
... }
>>> # Minimum dollar value generated by an itemset we care about across all transactions
>>> minutil = 20.00
>>> hui = TwoPhase(transactions, external_utilities, minutil)
>>> result = hui.get_hui()
>>> sorted(result, key=attrgetter('itemset_utility'), reverse=True)
... 
[HUIRecord(items=('Chips', 'Coke 12oz'), itemset_utility=30.02),
 HUIRecord(items=('Chips',), itemset_utility=20.93)]
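
The utilities in this output can be checked by hand. The sketch below is not the library's code. It assumes the usual definition of itemset utility (quantity times external utility, summed over every transaction that contains all items of the itemset), and `itemset_utility` is a hypothetical standalone helper written only for this check.

```python
transactions = [
    [("Coke 12oz", 6), ("Chips", 2), ("Dip", 1)],
    [("Coke 12oz", 1)],
    [("Coke 12oz", 2), ("Chips", 1)],
    [("Chips", 1)],
    [("Chips", 2)],
    [("Coke 12oz", 6), ("Chips", 1)],
]
external_utilities = {"Coke 12oz": 1.29, "Chips": 2.99, "Dip": 3.49}


def itemset_utility(itemset, transactions, external_utilities):
    """Hypothetical helper: utility of `itemset` across all transactions."""
    wanted = set(itemset)
    total = 0.0
    for transaction in transactions:
        quantities = dict(transaction)  # item -> quantity (items assumed unique)
        # Only transactions containing every item of the itemset contribute.
        if wanted.issubset(quantities):
            total += sum(quantities[i] * external_utilities[i] for i in wanted)
    return total


for candidate in [("Chips", "Coke 12oz"), ("Chips",), ("Coke 12oz",)]:
    print(candidate, round(itemset_utility(candidate, transactions, external_utilities), 2))
# ('Chips', 'Coke 12oz') 30.02
# ('Chips',) 20.93
# ('Coke 12oz',) 19.35
```

Under this definition, `("Coke 12oz",)` comes to 19.35, which is below `minutil = 20.00`. That is consistent with it being absent from the result above.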

calc_itemset_utility(itemset: Tuple)[source]

calc_transaction_utility(transaction: Tuple[Any, Union[int, float]])[source]

calc_twu(itemset: List[str])[source]

Calculates the transaction-weighted utilization (TWU) for an itemset: the summed utility of every transaction that contains the itemset.
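
As a rough illustration of what TWU measures, the sketch below recomputes it outside the library. The helper names `transaction_utility` and `twu` are hypothetical, and the data is the same as in the class example. It assumes the standard definition: the utility of a transaction is the sum of quantity times external utility over its items, and the TWU of an itemset is the sum of the utilities of all transactions containing it.

```python
transactions = [
    [("Coke 12oz", 6), ("Chips", 2), ("Dip", 1)],
    [("Coke 12oz", 1)],
    [("Coke 12oz", 2), ("Chips", 1)],
    [("Chips", 1)],
    [("Chips", 2)],
    [("Coke 12oz", 6), ("Chips", 1)],
]
external_utilities = {"Coke 12oz": 1.29, "Chips": 2.99, "Dip": 3.49}


def transaction_utility(transaction, external_utilities):
    """Hypothetical helper: total utility of one whole transaction."""
    return sum(qty * external_utilities[item] for item, qty in transaction)


def twu(itemset, transactions, external_utilities):
    """Hypothetical helper: sum the utility of transactions containing `itemset`."""
    wanted = set(itemset)
    total = 0.0
    for transaction in transactions:
        present = {item for item, _ in transaction}
        if wanted <= present:
            # The whole transaction counts, not just the itemset's share of it.
            total += transaction_utility(transaction, external_utilities)
    return total


print(round(twu(["Dip"], transactions, external_utilities), 2))                 # 17.21
print(round(twu(["Chips"], transactions, external_utilities), 2))               # 42.48
print(round(twu(["Chips", "Coke 12oz"], transactions, external_utilities), 2))  # 33.51
```

Because TWU counts whole transactions, it is never smaller than the exact utility of the itemset or of any of its supersets. With `minutil = 20.00`, an item such as "Dip" (TWU 17.21) can therefore be discarded without computing any exact utilities.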

classmethod ensure_transaction_items_unique(transactions)[source]

Ensure there are no duplicate items within a transaction, while preserving generators.
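
One way such a check could be written is sketched below. This is not the library's implementation: `ensure_unique` and `_check_unique` are hypothetical names, and raising `ValueError` on a duplicate is an assumption, since the documented behavior on duplicates is not stated here. The point of the sketch is the generator handling: a generator can only be consumed once, so the check is applied lazily instead of iterating over the input up front.

```python
from types import GeneratorType


def _check_unique(transaction):
    """Hypothetical helper: reject a transaction that repeats an item."""
    items = [item for item, _quantity in transaction]
    if len(items) != len(set(items)):
        # Assumption: a duplicate is treated as an error.
        raise ValueError(f"duplicate item in transaction: {transaction!r}")
    return transaction


def ensure_unique(transactions):
    """Hypothetical helper: validate transactions, keeping generators lazy."""
    if isinstance(transactions, GeneratorType):
        # Wrap in a new generator so the one-shot input is not consumed here;
        # each transaction is checked only when the caller iterates.
        return (_check_unique(t) for t in transactions)
    return [_check_unique(t) for t in transactions]
```

With a list, the check runs immediately. With a generator, a duplicate is only reported once iteration reaches the offending transaction.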

get_high_twu_itemsets(max_length: Optional[int] = None) → Generator[itemset_mining.two_phase_huim.HUIRecord, None, None][source]

Returns a generator of candidate itemsets whose transaction-weighted utilization (TWU) is at least minutil.

This is “Phase 1.”
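
A minimal sketch of how Phase 1 can work, assuming the standard Two-Phase formulation (level-wise candidate generation, pruned by TWU). It is not the library's code: `high_twu_itemsets` is a hypothetical function, it yields plain tuples instead of `HUIRecord` objects, and it materializes the transactions in memory.

```python
from itertools import combinations


def high_twu_itemsets(transactions, external_utilities, minutil, max_length=None):
    """Hypothetical Phase 1 sketch: yield itemsets whose TWU is >= minutil."""
    # Precompute (set of items, transaction utility) once per transaction.
    summaries = [
        (frozenset(item for item, _ in t),
         sum(qty * external_utilities[item] for item, qty in t))
        for t in transactions
    ]

    def twu(itemset):
        wanted = frozenset(itemset)
        return sum(tu for items, tu in summaries if wanted <= items)

    all_items = sorted({item for items, _ in summaries for item in items})
    level = [(item,) for item in all_items if twu((item,)) >= minutil]
    k = 1
    while level and (max_length is None or k <= max_length):
        yield from level
        survivors = set(level)
        pool = sorted({item for itemset in level for item in itemset})
        k += 1
        # Keep a k-itemset only if all of its (k-1)-subsets survived
        # and its own TWU still reaches the threshold.
        level = [
            c for c in combinations(pool, k)
            if all(s in survivors for s in combinations(c, k - 1))
            and twu(c) >= minutil
        ]


transactions = [
    [("Coke 12oz", 6), ("Chips", 2), ("Dip", 1)],
    [("Coke 12oz", 1)],
    [("Coke 12oz", 2), ("Chips", 1)],
    [("Chips", 1)],
    [("Chips", 2)],
    [("Coke 12oz", 6), ("Chips", 1)],
]
external_utilities = {"Coke 12oz": 1.29, "Chips": 2.99, "Dip": 3.49}

candidates = list(high_twu_itemsets(transactions, external_utilities, 20.00))
print(candidates)
# [('Chips',), ('Coke 12oz',), ('Chips', 'Coke 12oz')]
```

In the standard algorithm, Phase 2 then computes the exact utility of each candidate and keeps only those that reach `minutil`. A candidate can pass Phase 1 and still fail Phase 2, because TWU is only an upper bound on utility.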

get_hui(max_length: Optional[int] = None) → Generator[itemset_mining.two_phase_huim.HUIRecord, None, None][source]

initial_candidates()[source]

Returns the initial candidates.

property items

Returns the list of items that appear in transactions.