Table.print_table errors after homogenize with ints #700

Closed
nbedi opened this issue Nov 21, 2017 · 2 comments

@nbedi
Member

nbedi commented Nov 21, 2017

Found while investigating #699

Seems related to #411. My guess is homogenizing isn't converting int values from default_row into Decimal.

@jpmckinney
Member

jpmckinney commented Jul 14, 2021

I'm not sure that this is a bug. agate doesn't have much functionality for adding explicit values to an existing table, so the caller must supply the expected values and the default row as Decimal.

This errors:

import agate
data = [
    {'year': 1997, 'female_count': 2, 'male_count': 1},
    {'year': 2000, 'female_count': 3, 'male_count': 3}
]
key = 'year'
expected_values = (1997, 1998, 2000)
default_row = (0, 0)
t = agate.Table.from_object(data)
t.homogenize(key, expected_values, default_row).print_table()

This works:

import agate
from decimal import Decimal
data = [
    {'year': 1997, 'female_count': 2, 'male_count': 1},
    {'year': 2000, 'female_count': 3, 'male_count': 3}
]
key = 'year'
expected_values = [Decimal(num) for num in (1997, 1998, 2000)]
default_row = [Decimal(num) for num in (0, 0)]
t = agate.Table.from_object(data)
t.homogenize(key, expected_values, default_row).print_table()
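The conversion in the working example can be wrapped in a small helper so callers do not repeat the list comprehensions. This is only a sketch: to_decimals is a hypothetical name, not part of agate, and it uses nothing beyond the standard library.

```python
from decimal import Decimal


def to_decimals(values):
    """Return a list with plain ints and floats converted to Decimal.

    Hypothetical helper, not part of agate. Existing Decimals, bools,
    None and strings pass through unchanged.
    """
    converted = []
    for value in values:
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            # Going through str() avoids binary float artifacts:
            # Decimal(0.1) is not equal to Decimal('0.1').
            converted.append(Decimal(str(value)))
        else:
            converted.append(value)
    return converted


expected_values = to_decimals((1997, 1998, 2000))
default_row = to_decimals((0, 0))
# These can then be passed to homogenize as in the working example:
# t.homogenize('year', expected_values, default_row).print_table()
```

Converting before the call keeps the workaround on the caller's side and leaves homogenize itself untouched.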

@jpmckinney
Member

jpmckinney commented Jul 14, 2021


I got part-way with this patch:

diff --git a/agate/table/homogenize.py b/agate/table/homogenize.py
index 0e04ec4..57ceaf4 100644
--- a/agate/table/homogenize.py
+++ b/agate/table/homogenize.py
@@ -53,8 +53,16 @@ def homogenize(self, key, compare_values, default_row=None):
         if any(not utils.issequence(compare_value) for compare_value in compare_values):
             compare_values = [[compare_value] for compare_value in compare_values]
 
-    column_values = [self._columns.get(name) for name in key]
-    column_indexes = [self._column_names.index(name) for name in key]
+    column_values = []
+    column_indexes = []
+    for i, name in enumerate(key):
+        column = self._columns.get(name)
+        column_values.append(column)
+        column_indexes.append(self._column_names.index(name))
+        for values in compare_values:
+            values = list(values)
+            if i < len(values):
+                values[i] = column.data_type.cast(values[i])
 
     column_values = zip(*column_values)
     differences = list(set(map(tuple, compare_values)) - set(column_values))

However, it causes some tests to fail. It also doesn't deal with the default row, which is much more complicated to cast: each position in the default row would have to be mapped to the matching column index in the table before its value could be cast with that column's data type.
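The index mapping described above can be sketched outside agate. Everything here is an assumption for illustration: cast_for_homogenize is a hypothetical helper, casters stands in for each column's data_type.cast, and the default row is assumed to hold one value per non-key column in table order. The sketch also builds new rows instead of assigning into a list(values) copy, since in the patch that copy appears to be discarded and compare_values left unchanged.

```python
from decimal import Decimal


def cast_for_homogenize(column_names, casters, key, compare_values, default_row):
    """Cast compare values and a default row column by column.

    Hypothetical helper: `casters` maps a column name to a callable that
    stands in for the column's data_type.cast. Assumes default_row holds
    one value per non-key column, in table order.
    """
    # Build new rows so the cast values are actually kept.
    cast_compare = [
        tuple(casters[name](value) for name, value in zip(key, values))
        for values in compare_values
    ]
    # Map each default-row position to the non-key column it fills.
    non_key = [name for name in column_names if name not in key]
    if len(default_row) != len(non_key):
        raise ValueError('default_row needs one value per non-key column')
    cast_default = [
        casters[name](value) for name, value in zip(non_key, default_row)
    ]
    return cast_compare, cast_default


column_names = ['year', 'female_count', 'male_count']
casters = {name: Decimal for name in column_names}
compare, default = cast_for_homogenize(
    column_names, casters, ['year'], [[1997], [1998], [2000]], (0, 0)
)
```

A callable default row would need separate handling, which this sketch does not attempt.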

To reproduce the error in the test suite, it is enough to add a homogenized.print_table() call to the tests in test_homogenize.py.
