Unofficial Apyori Documentation
The post below reflects my unofficial docs for the pip-installable Apyori package (on PyPI, on GitHub). I am just a fan of the project and of Association Rule Learning generally, so I thought I’d write up some notes for the community below. I am in no way associated with the project, and would like to thank ymoch for all their hard work. If you find any inaccuracies below, please leave a comment.

Loading Transactions (API usage)

Transactions should be an iterable of iterables (e.g. a list of lists). For transactions already stored in this format in a variable, apriori() can be called directly on that object. However, if you want to load transactions from a file, use load_transactions() and pass it an open file object.
The object returned by load_transactions() is a generator of transactions. To view the transactions, you can convert it to a list:
Note: Avoid calling load_transactions(‘/path/to/file’) with a path string. To stay flexible enough to accept any file-like object, the function does not open paths itself, so a string argument will behave unexpectedly.
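To see what "unexpectedly" can look like, the sketch below calls csv.reader directly rather than Apyori, on the assumption that the argument reaches csv.reader unchanged. The path is hypothetical and is never opened.

```python
import csv

# Hypothetical path; it is never opened, which is exactly the problem.
path = "/tmp/x.tsv"

# csv.reader iterates over whatever it is given. Iterating over a str
# yields single characters, so every character becomes its own "row".
rows = list(csv.reader(path, delimiter="\t"))

print(rows[:3])  # [['/'], ['t'], ['m']]
```

No error is raised, so the mistake only shows up later as nonsensical one-character "items".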


Advanced Usage
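Under the hood, this function uses Python’s built-in csv.reader, and it accepts a delimiter keyword argument that is passed through to the reader. Note that delimiter appears to be the only option forwarded, so do not rely on other csv.reader keyword arguments being honored. The delimiter is particularly important because load_transactions defaults to a tab delimiter, so comma-separated files need it set explicitly.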
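The effect of the delimiter can be previewed with csv.reader alone. This is only a sketch of the underlying behaviour, not Apyori's own code; with Apyori installed, the corresponding call is assumed to be load_transactions(f, delimiter=","). The sample data is invented.

```python
import csv
import io

# Comma-separated sample data, invented for illustration.
data = "beer,nuts\nbeer,cheese\n"

# With a tab delimiter (the default described above), each line contains
# no tabs, so the whole line is read as a single item.
tab_rows = list(csv.reader(io.StringIO(data), delimiter="\t"))

# With the delimiter set to a comma, the items are split as intended.
comma_rows = list(csv.reader(io.StringIO(data), delimiter=","))

print(tab_rows)    # [['beer,nuts'], ['beer,cheese']]
print(comma_rows)  # [['beer', 'nuts'], ['beer', 'cheese']]
```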

Apriori (API usage)

Running the Apriori algorithm on your transactions is as simple as calling apriori(transactions).
The algorithm has four parameters: min_support (defaults to 0.1), min_confidence (defaults to 0.0), min_lift (defaults to 0.0), and max_length (defaults to None). A realistic parameterization (depending heavily on your data and use case) might look like:
What’s returned is a generator of your results. If your data fits into memory and you’d prefer to interact with it that way, you can create a list from the results. E.g.:


Full Example


CLI Usage

The official documentation provides adequate coverage of the CLI usage.

Understanding Apriori Output

Important Note: Before proceeding beyond this point, please make sure you understand how the algorithm works and all of its parameters. I have given a couple of beginner-level presentations on Association Rule Learning, with in-depth explanations of the Apriori algorithm, slides for which can be found here. There are links to additional resources in the presentation.
Looking at the example found in the docs:
Our results would appear as a list containing multiple entries such as the one that follows:
Each RelationRecord reflects all rules associated with a specific itemset (items) that has relevant rules. Support (support) is the fraction of transactions in which those items appear together; it is the same for every rule involving those items, and so appears only once per RelationRecord. The ordered_statistics field is a list of all rules that met our min_confidence and min_lift requirements (parameterized when we called apriori()). Each OrderedStatistic contains the antecedent (items_base) and consequent (items_add) for the rule, as well as the associated confidence and lift.
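To make these numbers concrete, the sketch below recomputes support, confidence, and lift by hand for the rule nuts → beer on the same two transactions. It uses only the standard definitions and plain Python; the support() helper is a hypothetical name for illustration, not part of Apyori.

```python
# The same two transactions, as sets for easy subset checks.
transactions = [
    {"beer", "nuts"},
    {"beer", "cheese"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


antecedent = {"nuts"}  # corresponds to items_base
consequent = {"beer"}  # corresponds to items_add

rule_support = support(antecedent | consequent)  # 1 of 2 transactions = 0.5
confidence = rule_support / support(antecedent)  # 0.5 / 0.5 = 1.0
lift = confidence / support(consequent)          # 1.0 / 1.0 = 1.0

print(rule_support, confidence, lift)  # 0.5 1.0 1.0
```

A lift of 1.0 here reflects that beer appears in every transaction, so knowing that a basket contains nuts tells us nothing new about whether it contains beer.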
