Add support to transformed_data for reconstructed (from_dict) charts #354

binste · 2023-07-07T12:17:08Z

Relates to #313. For layered and multi-view charts, Altair takes a bit of a shortcut: instead of the exact classes which map to the Vega-Lite schema, it uses the API convenience classes such as alt.Chart, alt.LayerChart, etc. For example, in the layered chart below, the elements of layer are alt.Chart instances:

import altair as alt
import vegafusion as vf
from vega_datasets import data

source = data.wheat()

bar = alt.Chart(source).mark_bar().encode(
    x='year:O',
    y='wheat:Q'
)

rule = alt.Chart(source).mark_rule(color='red').encode(
    y='mean(wheat):Q'
)

layer_chart = (bar + rule).properties(width=600)
type(layer_chart.layer[0])  # altair.vegalite.v5.api.Chart

However, when reconstructing this chart from a dictionary, the layers are of type UnitSpec, which is the correct class according to the Vega-Lite schema:

reconstructed_layer_chart = alt.Chart.from_dict(layer_chart.to_dict())
type(reconstructed_layer_chart.layer[0])  # altair.vegalite.v5.schema.core.UnitSpec

Calling transformed_data on the reconstructed chart fails because the isinstance checks do not accept UnitSpec as equivalent to alt.Chart, although they should:

vf.transformed_data(reconstructed_layer_chart)

ValueError: transformed_data accepts an instance of Chart, FacetChart, LayerChart, HConcatChart, VConcatChart, or ConcatChart
Received value of type: <class 'altair.vegalite.v5.schema.core.UnitSpec'>

Classes which I think can be treated as equivalent to the API class named first in each bullet:

Chart: TopLevelUnitSpec (parent class of Chart), FacetedUnitSpec, UnitSpec, UnitSpecWithFrame, NonNormalizedSpec (used in concat charts for chart objects in e.g. hconcat attribute)
LayerChart: TopLevelLayerSpec (parent class of LayerChart), LayerSpec
RepeatChart: TopLevelRepeatSpec (parent class of RepeatChart), RepeatSpec
ConcatChart: TopLevelConcatSpec (parent class of ConcatChart), ConcatSpecGenericSpec
HConcatChart: TopLevelHConcatSpec (parent class of HConcatChart), HConcatSpecGenericSpec
VConcatChart: TopLevelVConcatSpec (parent class of VConcatChart), VConcatSpecGenericSpec
FacetChart: TopLevelFacetSpec (parent class of FacetChart), FacetSpec

Spec classes which I'm not sure about but we can probably ignore as repeat charts are not supported yet by transformed_data anyway:

LayerRepeatSpec
NonLayerRepeatSpec
GenericUnitSpecEncodingAnyMark

Just a reminder that the same changes would need to be applied to https://github.com/altair-viz/altair/blob/master/altair/utils/_transformed_data.py.

Btw, my use case for the above is that a frontend sends the chart specifications as a JSON to a backend API which executes vf.transformed_data(alt.Chart.from_dict(spec)) and then returns the extracted data as an Excel file to the user of the web application.


jonmmease · 2023-07-07T13:06:01Z

Thanks for the report @binste, I think this all makes sense. And now that chart.transformed_data is in Altair master, we need to fix it there as well!

binste · 2023-07-07T13:47:32Z

I don't have a working development setup for VegaFusion right now. Is it easy to set it up to work purely on the Python part without Rust, or do I need to compile the Rust code anyway? I assume the latter. Let me know if I should lend a hand on implementing the changes in Altair.

jonmmease · 2023-07-07T13:50:02Z

> Let me know if I should lend a hand on implementing the changes in Altair.

You can develop the Python part of VegaFusion on its own. But if you want to work on fixing it in the Altair implementation of chart.transformed_data (which doesn't use the vf.transformed_data implementation), I could pretty easily copy the fixes into the VegaFusion implementation afterward.

binste · 2023-07-07T15:03:14Z

Great, I'll work on the Altair implementation probably early next week.

…am` test

Struggling to find the source of the failure, the `mark` is my best guess but currently can't reproduce the following error:

```
___________________________________________ test_compound_chart_examples[False-scatter_with_layered_histogram.py-all_rows18-all_cols18] ___________________________________________
[gw2] win32 -- Python 3.8.19 C:\Users\*\AppData\Local\hatch\env\virtual\altair\CXM7NV9I\hatch-test.py3.8\Scripts\python.exe

filename = 'scatter_with_layered_histogram.py', all_rows = [2, 17], all_cols = [['gender'], ['__count']], to_reconstruct = False

    @pytest.mark.skipif(vf is None, reason="vegafusion not installed")
    # fmt: off
    @pytest.mark.parametrize("filename,all_rows,all_cols", [
        ("errorbars_with_std.py", [10, 10], [["upper_yield"], ["extent_yield"]]),
        ("candlestick_chart.py", [44, 44], [["low"], ["close"]]),
        ("co2_concentration.py", [713, 7, 7], [["first_date"], ["scaled_date"], ["end"]]),
        ("falkensee.py", [2, 38, 38], [["event"], ["population"], ["population"]]),
        ("heat_lane.py", [10, 10], [["bin_count_start"], ["y2"]]),
        ("histogram_responsive.py", [20, 20], [["__count"], ["__count"]]),
        ("histogram_with_a_global_mean_overlay.py", [9, 1], [["__count"], ["mean_IMDB_Rating"]]),
        ("horizon_graph.py", [20, 20], [["x"], ["ny"]]),
        ("interactive_cross_highlight.py", [64, 64, 13], [["__count"], ["__count"], ["Major_Genre"]]),
        ("interval_selection.py", [123, 123], [["price_start"], ["date"]]),
        ("layered_chart_with_dual_axis.py", [12, 12], [["month_date"], ["average_precipitation"]]),
        ("layered_heatmap_text.py", [9, 9], [["Cylinders"], ["mean_horsepower"]]),
        ("multiline_highlight.py", [560, 560], [["price"], ["date"]]),
        ("multiline_tooltip.py", [300, 300, 300, 0, 300], [["x"], ["y"], ["y"], ["x"], ["x"]]),
        ("pie_chart_with_labels.py", [6, 6], [["category"], ["value"]]),
        ("radial_chart.py", [6, 6], [["values"], ["values_start"]]),
        ("scatter_linked_table.py", [392, 14, 14, 14], [["Year"], ["Year"], ["Year"], ["Year"]]),
        ("scatter_marginal_hist.py", [34, 150, 27], [["__count"], ["species"], ["__count"]]),
        ("scatter_with_layered_histogram.py", [2, 17], [["gender"], ["__count"]]),
        ("scatter_with_minimap.py", [1461, 1461], [["date"], ["date"]]),
        ("scatter_with_rolling_mean.py", [1461, 1461], [["date"], ["rolling_mean"]]),
        ("seattle_weather_interactive.py", [1461, 5], [["date"], ["__count"]]),
        ("select_detail.py", [20, 1000], [["id"], ["x"]]),
        ("simple_scatter_with_errorbars.py", [5, 5], [["x"], ["upper_ymin"]]),
        ("stacked_bar_chart_with_text.py", [60, 60], [["site"], ["site"]]),
        ("us_employment.py", [120, 1, 2], [["month"], ["president"], ["president"]]),
        ("us_population_pyramid_over_time.py", [19, 38, 19], [["gender"], ["year"], ["gender"]]),
    ])
    # fmt: on
    @pytest.mark.parametrize("to_reconstruct", [True, False])
    def test_compound_chart_examples(filename, all_rows, all_cols, to_reconstruct):
        source = pkgutil.get_data(examples_methods_syntax.__name__, filename)
        chart = eval_block(source)
        if to_reconstruct:
            # When reconstructing a Chart, Altair uses different classes
            # then what might have been originally used. See
            # vega/vegafusion#354 for more info.
            chart = alt.Chart.from_dict(chart.to_dict())
        dfs = chart.transformed_data()
        if not to_reconstruct:
            # Only run assert statements if the chart is not reconstructed. Reason
            # is that for some charts, the original chart contained duplicated datasets
            # which disappear when reconstructing the chart.
            assert len(dfs) == len(all_rows)
            for df, rows, cols in zip(dfs, all_rows, all_cols):
>               assert len(df) == rows
E               assert 19 == 17
E                +  where 19 = len( bin_step_5_age bin_step_5_age_end gender __count\n0 45.0 50.0 M 247\n1 ... 11\n17 30.0 35.0 F 5\n18 70.0 75.0 F 1)

tests\test_transformed_data.py:132: AssertionError
```

jonmmease added the bug (Something isn't working) label Jul 7, 2023

binste mentioned this issue Jul 10, 2023

Add support to transformed_data for reconstructed charts (with from_dict/from_json) vega/altair#3102

Merged
