test failure in test_excel_html_export with io.read_html · Issue #44 · reubano/meza · GitHub

test failure in test_excel_html_export with io.read_html #44

Open
nieder opened this issue Oct 19, 2022 · 1 comment

nieder commented Oct 19, 2022

Testing meza-0.46.0, I get this error (py38-py310):

Test for reading an html table exported from excel ... FAIL

======================================================================
FAIL: Test for reading an html table exported from excel
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/sw/lib/python3.9/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/sw/build.build/meza-py39-0.46.0-1/meza-0.46.0/tests/test_io.py", line 354, in test_excel_html_export
    nt.assert_equal(expected, next(records))
AssertionError: {'sparse_data': 'Iñtërnâtiônàližætiøn', 'so[61 chars]dam'} != {'13_width_75_some_date': '13 class=xl24 al[123 chars]dam'}
- {'some_date': '05/04/82',
+ {'13_width_75_some_date': '13 class=xl24 align=right>05/04/82',
+  '2_width_150_unicode_test': 'Ādam',
-  'some_value': '234',
+  '75_some_value': 'right>234',
?   +++              ++++++

-  'sparse_data': 'Iñtërnâtiônàližætiøn',
?                                       ^

+  '75_sparse_data': 'Iñtërnâtiônàližætiøn'}
?   +++                                    ^

-  'unicode_test': 'Ādam'}

----------------------------------------------------------------------

The output in the AssertionError line looks mangled: attributes from the HTML table elements are mixed into the keys and values.
If I remove the HTML attributes from the table in data/test/test.htm, the test passes. I notice that io.read_html uses BeautifulSoup. I have beautifulsoup-4.10.0 and soupsieve-2.3.1 installed.
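
The manual workaround (removing the table's HTML attributes) could be sketched roughly as a preprocessing step. This is only an illustration under assumptions: `strip_tag_attributes`, `CellCollector`, `table_rows`, and `SAMPLE` are hypothetical names that are not part of meza, and the sample markup is made up from the attribute fragments visible in the failure output rather than copied from test.htm. It uses only the standard library `html.parser`, not BeautifulSoup, so it does not reproduce the failure itself; it only shows what the cleaned cells would be expected to look like.

```python
import re
from html.parser import HTMLParser

# Hypothetical helper (not part of meza): drop all attributes from opening tags.
# Deliberately naive: it would misbehave on attribute values containing '>'.
_OPEN_TAG = re.compile(r"<([A-Za-z][\w:-]*)(?:\s[^>]*?)?(/?)>")


def strip_tag_attributes(markup):
    """Return markup with every opening tag reduced to its bare tag name."""
    return _OPEN_TAG.sub(r"<\1\2>", markup)


class CellCollector(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            if not self.rows:  # tolerate a cell that appears before any <tr>
                self.rows.append([])
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def table_rows(markup):
    """Parse markup and return a list of rows, each a list of cell strings."""
    parser = CellCollector()
    parser.feed(markup)
    parser.close()
    return parser.rows


# Made-up sample with unquoted attributes, loosely modeled on the error output.
SAMPLE = (
    "<table border=0 width=300>"
    "<tr height=13><td width=75>Some Date</td><td width=75>Some Value</td></tr>"
    "<tr height=13><td class=xl24 align=right>05/04/82</td>"
    "<td align=right>234</td></tr>"
    "</table>"
)

cleaned = strip_tag_attributes(SAMPLE)
rows = table_rows(cleaned)
print(cleaned)
print(rows)
```

Comparing `rows` with what io.read_html yields for the same markup might help narrow down whether the unquoted attributes are what trips up the parsing.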

nieder commented Mar 1, 2025

With meza-0.47.0, the test still fails, though possibly in a different way:

_________________________________________________ TestInput.test_excel_html_export __________________________________________________

self = <tests.test_io.TestInput object at 0x10381d720>

    def test_excel_html_export(self):  # pylint: disable=R0201
        """Test for reading an html table exported from excel"""
        filepath = p.join(io.DATA_DIR, "test.htm")
        records = io.read_html(filepath, sanitize=True, first_row_as_header=True)
    
        expected = {
            "sparse_data": "Iñtërnâtiônàližætiøn",
            "some_date": "05/04/82",
            "some_value": "234",
            "unicode_test": "Ādam",
        }
    
>       assert expected == next(records)
E       AssertionError: assert {'sparse_data': 'Iñtërnâtiônàližætiøn', 'some_date': '05/04/82', 'some_value': '234', 'unicode_test': 'Ādam'} == {'': '13 width=75>Some Date'}
E         Left contains 4 more items:
E         {'some_date': '05/04/82',
E          'some_value': '234',
E          'sparse_data': 'Iñtërnâtiônàližætiøn',
E          'unicode_test': 'Ādam'}
E         Right contains 1 more item:
E         {'': '13 width=75>Some Date'}
E         Full diff:
E           {
E         -  '': '13 width=75>Some Date',
E         +  'some_date': '05/04/82',
E         +  'some_value': '234',
E         +  'sparse_data': 'Iñtërnâtiônàližætiøn',
E         +  'unicode_test': 'Ādam',
E           }

tests/test_io.py:355: AssertionError
