Avoid Duplicate Titles When Reading Multiple HTML Files (#1829)
This issue arose while researching issue #1823. That issue was not a bug;
the author just needed clarification on how to use the software.
While researching it, however, I discovered that loading HTML into 2
sheets of a spreadsheet causes a problem if the HTML title tag is the same
for both sheets. PhpSpreadsheet is able to save the resulting file,
but Excel cannot read it properly because of the duplicate title.
The worksheet setTitle method allows for disambiguation in such a circumstance.
The HTML reader passed a parameter indicating "don't disambiguate", but I can't
see any harm in changing that to "disambiguate". An extremely simple fix,
with tests to back it up.
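
As a rough illustration of what "disambiguate" means here, the sketch below appends the lowest free integer to a title that is already taken, which matches the 'Sheet', 'Sheet 1', 'Sheet 2' sequence asserted in the new test. The function name disambiguateTitle and the plain array of existing titles are hypothetical stand-ins, not PhpSpreadsheet's API, and the sketch ignores details such as Excel's 31-character limit on sheet names and case-insensitive name matching.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of PhpSpreadsheet): return $title unchanged
 * if it is unused, otherwise append the lowest integer that makes it unique.
 *
 * @param string[] $existingTitles titles already present in the workbook
 */
function disambiguateTitle(string $title, array $existingTitles): string
{
    // A title that is not yet in use needs no suffix.
    if (!in_array($title, $existingTitles, true)) {
        return $title;
    }

    // Try "Title 1", "Title 2", ... until a free name is found.
    $i = 1;
    while (in_array("$title $i", $existingTitles, true)) {
        ++$i;
    }

    return "$title $i";
}

// Simulate loading three HTML documents that all carry <title>Sheet</title>.
$titles = [];
foreach (['Sheet', 'Sheet', 'Sheet'] as $requested) {
    $titles[] = disambiguateTitle($requested, $titles);
}

print_r($titles);
```

With disambiguation switched off, all three sheets would keep the title 'Sheet', which is the duplicate-title file that Excel fails to read properly.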
oleibman committed Feb 27, 2021
1 parent 25f7dcb commit cb23cca
Showing 2 changed files with 30 additions and 1 deletion.
2 changes: 1 addition & 1 deletion src/PhpSpreadsheet/Reader/Html.php
@@ -320,7 +320,7 @@ private function processDomElementTitle(Worksheet $sheet, int &$row, string &$co
{
if ($child->nodeName === 'title') {
$this->processDomElement($child, $sheet, $row, $column, $cellContent);
- $sheet->setTitle($cellContent, true, false);
+ $sheet->setTitle($cellContent, true, true);
$cellContent = '';
} else {
$this->processDomElementSpanEtc($sheet, $row, $column, $cellContent, $child, $attributeArray);
29 changes: 29 additions & 0 deletions tests/PhpSpreadsheetTests/Reader/Html/HtmlLoadStringTest.php
@@ -89,4 +89,33 @@ public function testCanLoadFromStringIntoExistingSpreadsheet(): void
$spreadsheet = $reader->loadFromString($html, $spreadsheet);
self::assertEquals(2, $spreadsheet->getSheetCount());
}

public function testCanLoadDuplicateTitle(): void
{
$html = <<<'EOF'
<html>
<head>
<title>Sheet</title>
</head>
<body>
<table><tr><td>1</td></tr></table>
</body>
</html>
EOF;
$reader = new \PhpOffice\PhpSpreadsheet\Reader\Html();
$spreadsheet = $reader->loadFromString($html);
$reader->setSheetIndex(1);
$reader->loadFromString($html, $spreadsheet);
$reader->setSheetIndex(2);
$reader->loadFromString($html, $spreadsheet);
$sheet = $spreadsheet->getSheet(0);
self::assertEquals(1, $sheet->getCell('A1')->getValue());
self::assertEquals('Sheet', $sheet->getTitle());
$sheet = $spreadsheet->getSheet(1);
self::assertEquals(1, $sheet->getCell('A1')->getValue());
self::assertEquals('Sheet 1', $sheet->getTitle());
$sheet = $spreadsheet->getSheet(2);
self::assertEquals(1, $sheet->getCell('A1')->getValue());
self::assertEquals('Sheet 2', $sheet->getTitle());
}
}
