In development
- Fixes a bug where show_percentages used the incorrect denominator if filtering (e.g. min_subset_size) was applied. This bug was a regression introduced in version 0.7. (248)
- Align ylabels of subplots added using add_catplot. (266)
- Add a style_categories method to customize category plot styles, including shading of rows in the intersection matrix, and bars in the totals plot. (261 with thanks to Marcel Albus <maralbus>).
- Ability to disable totals plot with totals_plot_elements=0. (246)
- Ability to set totals y axis label (243)
- Added max_subset_rank to get only n most populous subsets. (253)
- Added support for min_subset_size and max_subset_size specified as percentage. (264)
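
The denominator fix and the percentage thresholds listed above can be illustrated with a small pure-Python sketch. It is not upsetplot's implementation: the helper names (parse_threshold, filter_with_percentages) and the "N%" string form of the threshold are assumptions made for illustration. It also assumes the intended denominator is the total over all subsets before any filtering.

```python
def parse_threshold(value, total):
    """Turn an absolute size or an 'N%' string (assumed form) into an absolute size."""
    if isinstance(value, str) and value.endswith("%"):
        return float(value[:-1]) * total / 100
    return value


def filter_with_percentages(subset_sizes, min_subset_size=None):
    """Drop small subsets, reporting each kept size as a fraction of the unfiltered total."""
    # The denominator is fixed before filtering, so removing subsets
    # does not inflate the fractions of the remaining ones.
    total = sum(subset_sizes.values())
    threshold = 0 if min_subset_size is None else parse_threshold(min_subset_size, total)
    return {
        subset: (size, size / total)
        for subset, size in subset_sizes.items()
        if size >= threshold
    }


sizes = {("cat1",): 50, ("cat2",): 30, ("cat1", "cat2"): 15, (): 5}
kept = filter_with_percentages(sizes, min_subset_size="10%")
for subset, (size, fraction) in kept.items():
    print(subset, size, f"{fraction:.1%}")
```

Because the total is taken before filtering, the fractions of the kept subsets need not sum to 1 once small subsets are dropped.
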
- Allowed show_percentages to be provided with a custom formatting string, for example to show more decimal places. (194)
- Added include_empty_subsets to UpSet and query to allow the display of all possible subsets. (185)
- sort_by and sort_categories_by now accept '-' prefix to their values to sort in reverse. 'input' and '-input' are also supported. (180)
- Added subsets attribute to QueryResult. (198)
- Fixed a bug where more than 64 categories could result in an error. (193)
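
The '-' prefix convention for sort_by described above can be sketched as follows. This is an illustrative pure-Python sketch, not the library's code: parse_sort_key and sort_subsets are hypothetical helpers, and the base directions chosen here (largest first for 'cardinality', fewest categories first for 'degree') are assumptions.

```python
def parse_sort_key(value):
    """Split a value such as '-degree' into its name and a reverse flag."""
    if value.startswith("-"):
        return value[1:], True
    return value, False


def sort_subsets(subsets, sort_by):
    """Order (categories, size) pairs by 'input', 'cardinality' or 'degree'."""
    name, reverse = parse_sort_key(sort_by)
    if name == "input":
        ordered = list(subsets)  # keep the order in which subsets were given
    elif name == "cardinality":
        ordered = sorted(subsets, key=lambda item: item[1], reverse=True)
    elif name == "degree":
        ordered = sorted(subsets, key=lambda item: len(item[0]))
    else:
        raise ValueError(f"unknown sort key: {value!r}" if (value := sort_by) else "empty sort key")
    # A leading '-' simply reverses whatever the base ordering produced.
    return ordered[::-1] if reverse else ordered


subsets = [(("cat1",), 3), (("cat1", "cat2"), 10), (("cat2",), 5)]
print(sort_subsets(subsets, "cardinality"))
print(sort_subsets(subsets, "-input"))
```
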
Patch release 0.8.2 handles deprecations in dependencies.
- Added query function to support analysing set-based data.
- Fixed support for matplotlib >3.5.2 (191. Thanks GuyTeichman)
- Added add_stacked_bars, similar to add_catplot but to add stacked bar charts to show discrete variable distributions within each subset. (137)
- Improved ability to control colors, and added a new example of same. Parameters other_dots_color and shading_color were added. facecolor will now default to white if matplotlib.rcParams['axes.facecolor'] is dark. (138)
- Added style_subsets to colour intersection size bars and matrix dots in the plot according to a specified query. (152)
- Added from_indicators to allow yet another data input format. This allows category membership to be easily derived from a DataFrame, such as when plotting missing values in the columns of a DataFrame. (143)
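
The idea behind indicator-based input, as in the missing-values use case mentioned above, can be sketched without any dependencies. This is only a conceptual illustration, not the from_indicators implementation: subsets_from_indicators is a hypothetical helper, and rows are plain dicts rather than a DataFrame.

```python
from collections import Counter


def subsets_from_indicators(rows, columns, indicator=bool):
    """Count rows per combination of columns for which the indicator is true."""
    counts = Counter()
    for row in rows:
        # Each column acts as a category; the row belongs to it when the
        # indicator holds for that column's value.
        members = frozenset(col for col in columns if indicator(row.get(col)))
        counts[members] += 1
    return counts


rows = [
    {"age": 31, "email": None, "phone": None},
    {"age": None, "email": "a@example.org", "phone": None},
    {"age": 45, "email": "b@example.org", "phone": "555-0100"},
    {"age": 27, "email": None, "phone": None},
]
# Treat "value is missing" as membership of the column's category.
missing = subsets_from_indicators(
    rows, ["age", "email", "phone"], indicator=lambda value: value is None
)
for members, count in missing.items():
    print(sorted(members), count)
```
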
- Support using input intersection order with sort_by=None (133 with thanks to Brandon B <outlace>).
- Add parameters for filtering by subset size (with thanks to Sichong Peng <SichongP>) and degree. (134)
- Fixed an issue where tick labels were not given enough space and overlapped category totals. (132)
- Fixed an issue where our implementation of sort_by='degree' apparently gave incorrect results for some inputs and versions of Pandas. (134)
- Fixed a regression which caused the first column to be hidden (125)
- Fixed issue with the order of catplots being reversed for vertical plots (122 with thanks to Enrique Fernandez-Blanco <ennanco>)
- Fixed issue with the x limits of vertical plots (121).
- Fixed large x-axis plot margins with high number of unique intersections (106 with thanks to Yidi Huang <huangy6>)
- Fixed the calculation of percentage which was broken in 0.4.0. (101)
- Added option to display both the absolute frequency and the percentage of the total for each intersection and category. (89 with thanks to Carlos Melus <maziello> and Aaron Rosenfeld <arosenfeld>)
- Improved efficiency where there are many categories, but valid combinations are sparse, if sort_by='degree'. (82)
- Permit truthy (not necessarily bool) values in index. (74 with thanks to ZaxR)
- intersection_plot_elements can now be set to 0 to hide the intersection size plot when add_catplot is used. (80)
- Added from_contents to provide an alternative, intuitive way of specifying category membership of elements.
- To improve code legibility and intuitiveness, sum_over=False was deprecated and a subset_size parameter was added. It will have better default handling of DataFrames after a short deprecation period.
- generate_data has been replaced with generate_counts and generate_samples.
- Fixed the display of the "intersection size" label on plots, which had been missing.
- Trying to improve nomenclature, upsetplot now avoids "set" to refer to the top-level sets, which are now to be known as "categories". This matches the intuition that categories are named, logical groupings, as opposed to "subsets". To this end:
  - generate_counts (formerly generate_data) now names its categories "cat1", "cat2" etc. rather than "set1", "set2", etc.
  - the sort_sets_by parameter has been renamed to sort_categories_by; the old name will be removed in version 0.4.
- Return a Series (not a DataFrame) from from_memberships if data is 1-dimensional.
- Added from_memberships to allow a more convenient data input format.
- plot and UpSet now accept a pandas.DataFrame as input, if the sum_over parameter is also given.
- Added an add_catplot method to UpSet which adds Seaborn plots of set intersection data to show more than just set size or total.
- Shading of subset matrix is continued through to totals.
- Added a show_counts option to show counts at the ends of bar plots. (5)
- Defined _repr_html_ so that an UpSet object will render in Jupyter notebooks. (36)
- Fix a bug where an error was raised if an input set was empty.