Qt 5.15: Incomplete support for sessions #5359

Open
The-Compiler opened this issue Apr 15, 2020 · 51 comments
Labels
bug: behavior Something doesn't work as intended, but doesn't crash. priority: 0 - high Issues which are currently the primary focus. qt: 5.15 Issues related to Qt 5.15.

@The-Compiler
Member

The-Compiler commented Apr 15, 2020

With Qt 5.15, when sessions are loaded, only about:blank is displayed. This is due to how the reverse-engineered binary format somehow changed inside Chromium...

(split off from #5237)


Update from October 2020:

  • qutebrowser v1.11.0 came with a first workaround for this issue which displays an explanation (qute://warning/sessions) and only opens the first page of a tab's history (rather than trying to load the full history and displaying about:blank).
  • With qutebrowser v1.12.0, sessions.lazy_restore is disabled with Qt 5.15 as well, so the page gets loaded rather than only loading the "suspended page" page, with no way to go back to the real page.
  • I hope to have either a rewritten session storage (this issue) or at least a rewritten lazy loading (Lazier load #4037) ready for v2.0.0, see this comment for details.
@The-Compiler
Member Author

Reproducing in :debug-console:

import qutebrowser.misc.sessions, PyQt5.QtCore
items = [qutebrowser.misc.sessions.TabHistoryItem(active=True, original_url=PyQt5.QtCore.QUrl('file:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt'), title='1.txt', url=PyQt5.QtCore.QUrl('file:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt'), user_data={'zoom': 1.0, 'scroll-pos': PyQt5.QtCore.QPoint()})]
tab = objreg.get('tab', tab=0, window=0, scope='tab')
tab.history.private_api.load_items(items)

results in about:blank being loaded, logs:

16:48:26 DEBUG    webview    browsertab:_on_before_load_started:961 Going to start loading: file:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt
16:48:26 DEBUG    webview    tabbedbrowser:_on_title_changed:750 Changing title for idx 0 to 'file:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt'
16:48:26 DEBUG    webview    browsertab:_on_before_load_started:961 Going to start loading: file:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt
16:48:26 DEBUG    webview    tabbedbrowser:_on_title_changed:750 Changing title for idx 0 to 'file:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt'
[eventfilter stuff]
16:48:26 DEBUG    webview    browsertab:_set_load_status:927 load status for <qutebrowser.browser.webengine.webenginetab.WebEngineTab tab_id=0 url='about:blank'>: LoadStatus.loading
16:48:26 DEBUG    signals    signalfilter:_filter_signals:87 emitting: cur_load_status_changed(<LoadStatus.loading: 6>) (tab 0)
16:48:26 DEBUG    signals    signalfilter:_filter_signals:87 emitting: cur_load_started() (tab 0)
16:48:26 DEBUG    webview    tabbedbrowser:_on_title_changed:750 Changing title for idx 0 to '1.txt'
16:48:26 DEBUG    webview    tabbedbrowser:_on_title_changed:750 Changing title for idx 0 to 'about:blank'
16:48:26 DEBUG    webview    browsertab:_set_load_status:927 load status for <qutebrowser.browser.webengine.webenginetab.WebEngineTab tab_id=0 url='about:blank'>: LoadStatus.success
16:48:26 DEBUG    signals    signalfilter:_filter_signals:87 emitting: cur_load_status_changed(<LoadStatus.success: 2>) (tab 0)
16:48:26 DEBUG    sessions   sessions:save:305 Saving session _autosave to /home/florian/proj/qutebrowser/git/sesstest/data/sessions/_autosave.yml...
16:48:26 DEBUG    signals    signalfilter:_filter_signals:87 emitting: cur_load_finished(True) (tab 0)

@The-Compiler
Member Author

Data we get from Qt 5.14 with tab.history.private_api.serialize():

PyQt5.QtCore.QByteArray(b"\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00Jfile:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt\x00\x00\x00\x00\x00\x00\x01\x84\x80\x01\x00\x00\x1b\x00\x00\x00x\x01\x00\x00\x18\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00`\x00\x00\x00\x00\x00\x00\x00X\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf8\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xb4\xdb\x9f\xdcU\xa3\x05\x00\xb5\xdb\x9f\xdcU\xa3\x05\x00\xe0\x00\x00\x00\x00\x00\x00\x00\xf8\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x9c\x00\x00\x00J\x00\x00\x00f\x00i\x00l\x00e\x00:\x00/\x00/\x00/\x00h\x00o\x00m\x00e\x00/\x00f\x00l\x00o\x00r\x00i\x00a\x00n\x00/\x00p\x00r\x00o\x00j\x00/\x00q\x00u\x00t\x00e\x00b\x00r\x00o\x00w\x00s\x00e\x00r\x00/\x00g\x00i\x00t\x00/\x00t\x00e\x00s\x00t\x00s\x00/\x00e\x00n\x00d\x002\x00e\x00n\x00d\x00/\x00d\x00a\x00t\x00a\x00/\x00n\x00u\x00m\x00b\x00e\x00r\x00s\x00/\x001\x00.\x00t\x00x\x00t\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x01\x00\xff\xff\xff\xff\x00\x00\x00\x02\x00\x00\x00Jfile:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt\x01\x00/\x01\xec%&'\xee\x00\x00\x00\x00\xff\xff\xff\xff")

old.bin.gz

with Qt 5.15:

PyQt5.QtCore.QByteArray(b'\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00Jfile:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt\x00\x00\x00\x00\x00\x00\x01\xe4\xe0\x01\x00\x00\x1c\x00\x00\x00\xd8\x01\x00\x00\x18\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00h\x00\x00\x00\x02\x00\x00\x00`\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00Z\x1dW\xe1U\xa3\x05\x00[\x1dW\xe1U\xa3\x05\x00@\x01\x00\x00\x00\x00\x00\x00X\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x9c\x00\x00\x00J\x00\x00\x00f\x00i\x00l\x00e\x00:\x00/\x00/\x00/\x00h\x00o\x00m\x00e\x00/\x00f\x00l\x00o\x00r\x00i\x00a\x00n\x00/\x00p\x00r\x00o\x00j\x00/\x00q\x00u\x00t\x00e\x00b\x00r\x00o\x00w\x00s\x00e\x00r\x00/\x00g\x00i\x00t\x00/\x00t\x00e\x00s\x00t\x00s\x00/\x00e\x00n\x00d\x002\x00e\x00n\x00d\x00/\x00d\x00a\x00t\x00a\x00/\x00n\x00u\x00m\x00b\x00e\x00r\x00s\x00/\x001\x00.\x00t\x00x\x00t\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x008\x00\x00\x00\x01\x00\x00\x000\x00\x00\x00\x00\x00\x00\x008\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x01\x00\xff\xff\xff\xff\x00\x00\x00\x02\x00\x00\x00Jfile:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt\x01\x00/\x01\xec)\xde\xf4X\x00\x00\x00\x00\xff\xff\xff\xff')

new.bin.gz
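
The leading fields of both dumps can be read without Qt. The following is a minimal sketch, assuming the outer framing visible in the bytes above: big-endian 32-bit integers for stream version, entry count and current index, followed by length-prefixed blobs in which a length of 0xFFFFFFFF marks a null value. The helper names (read_u32, read_blob, read_first_entry_prefix) are made up for illustration and are not part of qutebrowser or Qt.

```python
import struct

NULL_LENGTH = 0xFFFFFFFF  # length value that marks a null blob in the dumps above


def read_u32(buf, offset):
    """Read a big-endian unsigned 32-bit integer; return (value, next offset)."""
    (value,) = struct.unpack_from('>I', buf, offset)
    return value, offset + 4


def read_blob(buf, offset):
    """Read a length-prefixed blob; return (bytes or None, next offset)."""
    length, offset = read_u32(buf, offset)
    if length == NULL_LENGTH:
        return None, offset
    return buf[offset:offset + length], offset + length


def read_first_entry_prefix(buf):
    """Parse the stream header and the first entry up to its page state."""
    stream_version, offset = read_u32(buf, 0)
    count, offset = read_u32(buf, offset)
    current_index, offset = read_u32(buf, offset)
    url, offset = read_blob(buf, offset)         # encoded URL bytes
    title, offset = read_blob(buf, offset)       # title as raw UTF-16 bytes
    page_state, offset = read_blob(buf, offset)  # opaque page state blob
    return {
        'stream_version': stream_version,
        'count': count,
        'current_index': current_index,
        'url': url,
        'title': title,
        'page_state': page_state,
    }
```

In the two dumps above, the length prefix of the page state is 0x184 (388 bytes) with Qt 5.14 and 0x1e4 (484 bytes) with Qt 5.15, so the blob itself grew between the two versions.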

@The-Compiler
Member Author

These changes to tabhistory.py fix the issue and result in a file equal to what Qt/Chromium produce, by patching in a big chunk of the data from what we got above:

diff --git qutebrowser/browser/webengine/tabhistory.py qutebrowser/browser/webengine/tabhistory.py
index f630e8873..d77013f8a 100644
--- qutebrowser/browser/webengine/tabhistory.py
+++ qutebrowser/browser/webengine/tabhistory.py
@@ -33,7 +33,8 @@ from qutebrowser.utils import qtutils
 # Qt 5.14 added version 4 which also serializes favicons:
 # https://codereview.qt-project.org/c/qt/qtwebengine/+/279407
 # However, we don't care about those, so let's keep it at 3.
-HISTORY_STREAM_VERSION = 3
+# FIXME
+HISTORY_STREAM_VERSION = 4
 
 
 def _serialize_item(item, stream):
@@ -62,16 +63,21 @@ def _serialize_item(item, stream):
 
     ## toQt(entry->GetTitle());
     # \x00\x00\x00\n\x001\x00.\x00t\x00x\x00t
-    stream.writeQString(item.title)
+    # FIXME
+    stream.writeQString("")
 
     ## QByteArray(encodedPageState.data(), encodedPageState.size());
     # \xff\xff\xff\xff
-    qtutils.serialize_stream(stream, QByteArray())
+    # qtutils.serialize_stream(stream, QByteArray())
+    state = "00 00 01 E4 E0 01 00 00 1C 00 00 00 D8 01 00 00 18 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 68 00 00 00 02 00 00 00 60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 01 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00 01 00 00 00 00 00 00 5A 1D 57 E1 55 A3 05 00 5B 1D 57 E1 55 A3 05 00 40 01 00 00 00 00 00 00 58 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 9C 00 00 00 4A 00 00 00 66 00 69 00 6C 00 65 00 3A 00 2F 00 2F 00 2F 00 68 00 6F 00 6D 00 65 00 2F 00 66 00 6C 00 6F 00 72 00 69 00 61 00 6E 00 2F 00 70 00 72 00 6F 00 6A 00 2F 00 71 00 75 00 74 00 65 00 62 00 72 00 6F 00 77 00 73 00 65 00 72 00 2F 00 67 00 69 00 74 00 2F 00 74 00 65 00 73 00 74 00 73 00 2F 00 65 00 6E 00 64 00 32 00 65 00 6E 00 64 00 2F 00 64 00 61 00 74 00 61 00 2F 00 6E 00 75 00 6D 00 62 00 65 00 72 00 73 00 2F 00 31 00 2E 00 74 00 78 00 74 00 00 00 00 00 10 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 38 00 00 00 01 00 00 00 30 00 00 00 00 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00"
+    state_data = bytes.fromhex(state)
+    stream.writeRawData(state_data)
 
     ## static_cast<qint32>(entry->GetTransitionType());
     # chromium/ui/base/page_transition_types.h
-    # \x00\x00\x00\x00
-    stream.writeInt32(0)  # PAGE_TRANSITION_LINK
+    # \x02\x00\x00\x01
+    # PAGE_TRANSITION_LINK | PAGE_TRANSITION_FROM_ADDRESS_BAR
+    stream.writeInt32(0x02000000 | 1)
 
     ## entry->GetHasPostData();
     # \x00
@@ -82,9 +88,9 @@ def _serialize_item(item, stream):
     qtutils.serialize_stream(stream, QUrl())
 
     ## static_cast<qint32>(entry->GetReferrer().policy);
-    # chromium/third_party/WebKit/public/platform/WebReferrerPolicy.h
-    # \x00\x00\x00\x00
-    stream.writeInt32(0)  # WebReferrerPolicyAlways
+    # chromium/services/network/public/mojom/referrer_policy.mojom
+    # \x00\x00\x00\x02
+    stream.writeInt32(2)  # kNoReferrerWhenDowngrade
 
     ## toQt(entry->GetOriginalRequestURL());
     # \x00\x00\x00Jfile:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt
@@ -92,7 +98,7 @@ def _serialize_item(item, stream):
 
     ## entry->GetIsOverridingUserAgent();
     # \x00
-    stream.writeBool(False)
+    stream.writeBool(True)
 
     ## static_cast<qint64>(entry->GetTimestamp().ToInternalValue());
     # \x00\x00\x00\x00^\x97$\xe7
@@ -100,7 +106,15 @@ def _serialize_item(item, stream):
 
     ## entry->GetHttpStatusCode();
     # \x00\x00\x00\xc8
-    stream.writeInt(200)
+    # FIXME
+    stream.writeInt(0)
+
+    ## favicon
+    # \xff\xff\xff\xff
+    qtutils.serialize_stream(stream, QUrl())
+
+    with open('new-fake.bin', 'wb') as f:
+        f.write(bytes(stream.device().data()))
 
 
 def serialize(items):

The main missing bit is the page state, which apparently can't just be an invalid QByteArray anymore... Argh. Not really keen on reverse-engineering that...

@The-Compiler
Member Author

The-Compiler commented Apr 15, 2020

Analyzing the page state we got:

00 00 01 E4 E0 01 00 00 1C 00 00 00 D8 01 00 00 18 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 68 00 00 00 02 00 00 00 60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 01 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00 01 00 00 00 00 00 00 5A 1D 57 E1 55 A3 05 00 5B 1D 57 E1 55 A3 05 00 40 01 00 00 00 00 00 00 58 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 9C 00 00 00 4A 00 00 00 66 00 69 00 6C 00 65 00 3A 00 2F 00 2F 00 2F 00 68 00 6F 00 6D 00 65 00 2F 00 66 00 6C 00 6F 00 72 00 69 00 61 00 6E 00 2F 00 70 00 72 00 6F 00 6A 00 2F 00 71 00 75 00 74 00 65 00 62 00 72 00 6F 00 77 00 73 00 65 00 72 00 2F 00 67 00 69 00 74 00 2F 00 74 00 65 00 73 00 74 00 73 00 2F 00 65 00 6E 00 64 00 32 00 65 00 6E 00 64 00 2F 00 64 00 61 00 74 00 61 00 2F 00 6E 00 75 00 6D 00 62 00 65 00 72 00 73 00 2F 00 31 00 2E 00 74 00 78 00 74 00 00 00 00 00 10 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 38 00 00 00 01 00 00 00 30 00 00 00 00 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00

  • Total size: 488 bytes
  • 00 00 01 e4 is 484 bytes, interpreted as >I (oddball because big-endian, maybe QByteArray serialization in QDataStream?)
  • e0 01 00 00 is 480 bytes, interpreted as <I (base::Pickle header)
  • 1c 00 00 00 is version 28, interpreted as <i (page_state_serialization.cc)
  • D8 01 00 00 18 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 68 00 00 00 02 00 00 00 60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 01 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00 01 00 00 00 00 00 00 5A 1D 57 E1 55 A3 05 00 5B 1D 57 E1 55 A3 05 00 40 01 00 00 00 00 00 00 58 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 9C 00 00 00: unknown
  • 4A 00 00 00: size of URL string (74 chars == 148 bytes)
  • 66 00 69 00 6C 00 65 00 3A 00 2F 00 2F 00 2F 00 68 00 6F 00 6D 00 65 00 2F 00 66 00 6C 00 6F 00 72 00 69 00 61 00 6E 00 2F 00 70 00 72 00 6F 00 6A 00 2F 00 71 00 75 00 74 00 65 00 62 00 72 00 6F 00 77 00 73 00 65 00 72 00 2F 00 67 00 69 00 74 00 2F 00 74 00 65 00 73 00 74 00 73 00 2F 00 65 00 6E 00 64 00 32 00 65 00 6E 00 64 00 2F 00 64 00 61 00 74 00 61 00 2F 00 6E 00 75 00 6D 00 62 00 65 00 72 00 73 00 2F 00 31 00 2E 00 74 00 78 00 74 00: URL string
  • 00 00 00 00 10 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 38 00 00 00 01 00 00 00 30 00 00 00 00 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00: unknown
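
The three leading integers from the list above can be checked with a few lines of Python. This is only a sketch of the fields identified so far; the function name parse_page_state_header is hypothetical, and everything after the version field stays unparsed.

```python
import struct


def parse_page_state_header(data):
    """Split off the three leading integers described in the list above."""
    (qt_length,) = struct.unpack_from('>I', data, 0)    # big-endian length prefix
    (pickle_size,) = struct.unpack_from('<I', data, 4)  # little-endian payload size
    (version,) = struct.unpack_from('<i', data, 8)      # signed, so -1 is representable
    return {
        'qt_length': qt_length,
        'pickle_size': pickle_size,
        'version': version,
    }


# First 12 bytes of the page state analyzed above.
header = parse_page_state_header(bytes.fromhex('000001E4E00100001C000000'))
print(header)
```

Each size counts the bytes that follow its own 4-byte field, which is why the values step down by 4 (488 total, then 484, then 480).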

@The-Compiler
Member Author

The-Compiler commented Apr 15, 2020

Note that the source says:

// NOTE: If the version is -1, then the pickle contains only a URL string.
// See ReadPageState.

which can be seen here:

void ReadPageState(SerializeObject* obj, ExplodedPageState* state) {
  obj->version = ReadInteger(obj);

  if (obj->version == -1) {
    GURL url = ReadGURL(obj);
    // NOTE: GURL::possibly_invalid_spec() always returns valid UTF-8.
    state->top.url_string = base::UTF8ToUTF16(url.possibly_invalid_spec());
    return;
  }

  // ...

So I hoped to be able to replicate this by doing:

import struct
url = 'file:///home/florian/proj/qutebrowser/git/tests/end2end/data/numbers/1.txt'
encoded = url.encode('utf-16-le')  # UTF-16 little-endian, no BOM to strip
parts = []
parts.append(encoded)
parts.append(struct.pack('<I', len(url)))
parts.append(struct.pack('<i', -1))
parts.append(struct.pack('<I', len(b''.join(parts))))
parts.append(struct.pack('>I', len(b''.join(parts))))
state_data = b''.join(reversed(parts))

with open('new-fake-minus.bin', 'wb') as f:
    f.write(state_data)

stream.writeRawData(state_data)

Which results in:

00 00 00 A0 9C 00 00 00 FF FF FF FF 4A 00 00 00 66 00 69 00 6C 00 65 00 3A 00 2F 00 2F 00 2F 00 68 00 6F 00 6D 00 65 00 2F 00 66 00 6C 00 6F 00 72 00 69 00 61 00 6E 00 2F 00 70 00 72 00 6F 00 6A 00 2F 00 71 00 75 00 74 00 65 00 62 00 72 00 6F 00 77 00 73 00 65 00 72 00 2F 00 67 00 69 00 74 00 2F 00 74 00 65 00 73 00 74 00 73 00 2F 00 65 00 6E 00 64 00 32 00 65 00 6E 00 64 00 2F 00 64 00 61 00 74 00 61 00 2F 00 6E 00 75 00 6D 00 62 00 65 00 72 00 73 00 2F 00 31 00 2E 00 74 00 78 00 74 00

  • Total size: 164 bytes
  • 00 00 00 A0: 160 bytes as >I
  • 9C 00 00 00: 156 bytes as <I
  • FF FF FF FF: version -1, interpreted as <i
  • size of URL string and URL string, as above.
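
To double-check crafted data like this, the layout can be decoded again. The sketch below mirrors only the four fields listed above (big-endian length, little-endian payload size, version -1, character count plus UTF-16 data); whether Chromium's ReadGURL expects exactly this encoding is not confirmed here. The function name decode_url_only_state is made up for illustration.

```python
import struct


def decode_url_only_state(data):
    """Decode the hand-crafted version -1 layout listed above; return the URL."""
    (qt_length,) = struct.unpack_from('>I', data, 0)
    (pickle_size,) = struct.unpack_from('<I', data, 4)
    (version,) = struct.unpack_from('<i', data, 8)
    (char_count,) = struct.unpack_from('<I', data, 12)

    # Each size field counts the bytes that follow it.
    if qt_length != len(data) - 4:
        raise ValueError(f"outer length {qt_length} does not match data")
    if pickle_size != len(data) - 8:
        raise ValueError(f"payload size {pickle_size} does not match data")
    if version != -1:
        raise ValueError(f"expected version -1, got {version}")

    raw = data[16:16 + 2 * char_count]  # two bytes per UTF-16 code unit
    if len(raw) != 2 * char_count:
        raise ValueError("URL string is truncated")
    return raw.decode('utf-16-le')
```

For the 164-byte result above, the checks correspond to 160 == 164 - 4 and 156 == 164 - 8.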

Which doesn't look too bad, and kind of works - however, qutebrowser now gets stuck while loading the page instead...

@The-Compiler The-Compiler added bug: behavior Something doesn't work as intended, but doesn't crash. priority: 0 - high Issues which are currently the primary focus. qt: 5.15 Issues related to Qt 5.15. labels Apr 15, 2020
@Kingdread
Contributor

I did some digging, and I have a bad feeling about this.

In src/core/web_contents_adapter.cpp we have the code for deserializing the history:

static void deserializeNavigationHistory(QDataStream &input, int *currentIndex, std::vector<std::unique_ptr<content::NavigationEntry>> *entries, content::BrowserContext *browserContext)
{
    [...]
    for (int i = 0; i < count; ++i) {
        [...]
        input >> virtualUrl;
        input >> title;
        input >> pageState;
        input >> transitionType;
        input >> hasPostData;
        input >> referrerUrl;
        input >> referrerPolicy;
        input >> originalRequestUrl;
        input >> isOverridingUserAgent;
        input >> timestamp;
        input >> httpStatusCode;
        [...]
        entry->SetPageState(content::PageState::CreateFromEncodedData(std::string(pageState.data(), pageState.size())));
        [...]
}

The serialization looks mostly like what qutebrowser is doing (with the patch from above), the interesting line is the SetPageState, which in turn calls content::PageState::CreateFromEncodedData (part of Chromium):

// static
PageState PageState::CreateFromEncodedData(const std::string& data) {
  return PageState(data);
}

[...]
PageState::PageState(const std::string& data)
    : data_(data) {
  // TODO(darin): Enable this DCHECK once tests have been fixed up to not pass
  // bogus encoded data to CreateFromEncodedData.
  //DCHECK(IsValid());
}

My guess is that the encoded page state that we pass to Qt never actually makes it through ReadPageState and all the version checks, and it's just assumed that it's valid.

@The-Compiler
Member Author

Indeed - but with Qt < 5.15 we got away with just passing no data at all for the page state. That doesn't work anymore. In the comment above, I tried to craft a page state which is as minimal as possible based on the code (using a -1 version number and only serializing a URL). That caused QtWebEngine/Chromium to show the correct URL (rather than about:blank), but it somehow was stuck loading infinitely.

So either:

  1. I made a mistake in the data above (will need to double-check with a debug build and gdb)
  2. The -1 "version" isn't actually supported properly anymore, and we do need to pass a more "proper" page state.

Right now I can see three paths forward:

  1. Properly reverse-engineer the current pagestate data format (would also allow us to save things like POST data and form inputs in sessions, which is kinda nice!)
  2. Check the earliest format we can get (e.g. with QtWebEngine 5.7 or even 5.4) - hopefully something close to version 11, which is the oldest supported format and should be much simpler - then reverse-engineer that
  3. Store the page state as base64 in the session files and restore it; and if opening older session files with Qt 5.15, just open the newest page and lose the tab history

@Kingdread
Contributor

Looks like I might not have dug deep enough: further down, there are some calls to DecodePageState, which in turn uses ReadPageState. I assumed that PageState was the proper object that we need, not ExplodedPageState. So it should work after all. I'm playing around with it but can't get it to work "more properly" than what you already have right now.

I guess option 2 would be a nice compromise between the effort to imitate Chromium's serialization and the capability to store sessions. Option 3 also means no transferring tab history between backends then (if anyone ever did that, so probably not a huge loss).

@The-Compiler
Member Author

Here's a quick dumper:

import sys

from PyQt5.QtWebEngineWidgets import QWebEngineView
from PyQt5.QtWidgets import QApplication
from PyQt5.QtCore import QByteArray, QDataStream, QIODevice, QUrl


def on_load_finished():
    data = QByteArray()
    stream = QDataStream(data, QIODevice.ReadWrite)
    assert stream.status() == QDataStream.Ok
    stream << view.history()
    assert stream.status() == QDataStream.Ok
    stream.device().seek(0)

    print(f'raw data: {bytes(data).hex()}\n\n')

    version = stream.readInt()
    print(f"version: {version}")

    count = stream.readInt()
    print(f"count: {count}")

    current = stream.readInt()
    print(f"current index: {current}")

    for i in range(count):
        print(f"\n---- entry {i} ----")
        url = QUrl()
        stream >> url
        print(f"GetVirtualURL: {url}")

        title = stream.readQString()  # the title is written as a QString
        print(f"title: {title}")

        pagestate = QByteArray()
        stream >> pagestate
        print(f"pagestate: {bytes(pagestate).hex()}")

        transition = stream.readInt32()
        print(f"transition: {hex(transition)}")

        has_post_data = stream.readBool()
        print(f"has post data: {has_post_data}")

        referrer = QUrl()
        stream >> referrer
        print(f"referrer: {referrer}")

        referrer_policy = stream.readInt32()
        print(f"referrer policy: {referrer_policy}")

        original_request_url = QUrl()
        stream >> original_request_url
        print(f"original request url: {original_request_url}")

        is_overriding_user_agent = stream.readBool()
        print(f"is overriding user agent: {is_overriding_user_agent}")

        time = stream.readInt64()
        print(f"time: {time}")

        http_status = stream.readInt()
        print(f"http status: {http_status}")

        if version >= 4:
            favicon_url = QUrl()
            stream >> favicon_url
            print(f"favicon url: {favicon_url}")

    # Only check for the end of the stream once all entries have been read.
    assert stream.atEnd()

    app.quit()


app = QApplication([])
view = QWebEngineView()
# view.show()
view.loadFinished.connect(on_load_finished)
view.load(QUrl.fromUserInput(sys.argv[1]))
app.exec_()

With PyQt 5.7, we get: 00010000170000000000000094000000660069006c0065003a002f002f002f0068006f006d0065002f0066006c006f007200690061006e002f00700072006f006a002f007100750074006500620072006f0077007300650072002f006700690074002f00740065007300740073002f0065006e006400320065006e0064002f0064006100740061002f006e0075006d0062006500720073002f0031002e00740078007400ffffffff0000000000000000ffffffff00000000080000000000000000000000a5211cfd67a30500a6211cfd67a3050001000000080000000000000000000000080000000000000000000000000000000000000000000000ffffffff00000000

So that's version 0x17 == 23 of the data. It is a bit simpler, but not as much as I hoped (260 bytes instead of 484 bytes for the file:// URL). Haven't checked whether it actually loads with Qt 5.15 yet.

@The-Compiler
Member Author

Option 3 also means no transferring tab history between backends then (if anyone ever did that, so probably not a huge loss).

Not necessarily. We can still store the history as-is, the problem is that we won't be able to load it with Qt 5.15 and QtWebEngine (though I haven't checked what actually happens to the loaded pages with Qt 5.15 - maybe the history is intact, just the newest page is broken?)

@Kingdread
Contributor

I've tried your version 23 dump, and I got a "URL cannot be found" message (which makes sense, given that it's hardcoded to your path). I tried modifying it to my own path here (modifying the dump and adjusting the length fields), and even wrote a version 23 serializer for the data, but the original issue persists - upon loading the session, the page does not get properly re-loaded.

Maybe another option is to trigger a re-load of the page when the session is loaded (lazily, once per tab)?

For reference, slightly annotated version 23 dumper:

import struct
url = item.url.toDisplayString()
encoded = url.encode('utf-16-le')  # UTF-16 little-endian, no BOM to strip
parts = []
# Version
parts.append(struct.pack('<i', 23))
# Referenced file array
parts.append(struct.pack('<I', 0))
# URL string
parts.append(struct.pack('<I', len(encoded)))
parts.append(encoded)
# Target
parts.append(struct.pack('<i', -1))
# Scroll offsets
parts.append(struct.pack('<I', 0))
parts.append(struct.pack('<I', 0))
# Referrer
parts.append(struct.pack('<i', -1))
# Document state
parts.append(struct.pack('<I', 0))
# Page scale factor (saved as length of double + double data)
parts.append(struct.pack('<I', 8))
parts.append(struct.pack('<d', 0.0))
# Item and document sequence number
parts.append(struct.pack('<Q', 0))
parts.append(struct.pack('<Q', 0))
# Referrer policy
parts.append(struct.pack('<I', 1))
# Visual viewport scroll offset (same as scale factor)
parts.append(struct.pack('<I', 8))
parts.append(struct.pack('<d', 0.0))
parts.append(struct.pack('<I', 8))
parts.append(struct.pack('<d', 0.0))
# Scroll restoration type
parts.append(struct.pack('<I', 0))
# Has state object
parts.append(struct.pack('<I', 0))
# False for HTTP body
parts.append(struct.pack('<I', 0))
# HTTP content type
parts.append(struct.pack('<i', -1))
# Subitems
parts.append(struct.pack('<I', 0))

parts.insert(0, struct.pack('<I', len(b''.join(parts))))

state_data = b''.join(parts)

with open('new-fake-minus.bin', 'wb') as f:
    f.write(state_data)

qtutils.serialize_stream(stream, QByteArray(state_data))

@The-Compiler
Member Author

Meh... But if you dump your own version 28 dump (or rewrite mine above) and load that, that works? If so, there's probably still hope we can get some minimal valid data for version 28 (or find out what part of the data causes it to load infinitely vs. not do so)

@Kingdread
Contributor

If I dump my own version 28 dump, it loads correctly (to the URL that I dumped it with), but I haven't yet managed to make a modified one with a custom URL. After version 26, the serialization switched to some Mojo-based serialization, which shuffles some things around, and I don't understand the format yet.

@The-Compiler
Member Author

Argh, that mojo stuff sounds like a pain. Probably need to find a way out which doesn't rely on reverse-engineering the format then...

@The-Compiler
Member Author

I've been thinking about this some more, and I think it's unrealistic to reverse-engineer the format. Even if we managed to do so, there are various problems:

  • It's likely going to change often, causing more work with every Qt upgrade.
  • There's a lot of data for which we don't know the real values - so we would have to guess a lot of things, without knowing what the benefits of doing so are.

So there aren't many options left, with all of them being a dead end, as far as we know:

  • Get version -1 to work (maybe by forcing a reload or something - but what about older entries in the history?)
  • Get version 23 to work, but with stubbed data (kinda like what you did above)

Instead, I'd propose we use this as an opportunity to introduce a new session format, which saves/restores the page state 1:1. While we're at it, we can also switch from YAML to JSON for session files, to (partially) move away from it - that's something I wanted to do anyways due to performance/maintenance issues, see #2777.

What that would mean:

  • We keep .yml session files as "legacy format" supported for a while, but for every tab, we only read the most recently opened page. The rest of the information is discarded, but the .yml file still stays around (TODO: Find out if people really need the other information to persist)
  • Instead of getting the current information (zoom, scroll position, etc.) from a variety of sources, we interpret the existing QDataStream and get the information from there. We will still need to store some additional information (e.g. whether a tab is pinned), most likely.
  • Instead of setting that information in a variety of ways, we reconstruct and deserialize the QDataStream
  • All the information which is directly in the QDataStream (even things we didn't save so far, like the HTTP status code) will land in the JSON, so we can have a 1:1 reconstruction.
  • To make the JSON a bit more minimal, we can probably assume some default values for most fields, so that we don't need to store them if they are equal to the default value
  • For the page state, we save an additional <sessionname>-pagestate.bin binary file, treating it as an opaque blob (could also have it as base64 inside the JSON, but it might get big with form inputs and whatnot).
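
The "assume default values" idea from the list above could look roughly like this. It is only a sketch: the field names and default values in ENTRY_DEFAULTS are hypothetical placeholders, and the real set would have to come from what the QDataStream actually contains.

```python
import json

# Hypothetical defaults, for illustration only.
ENTRY_DEFAULTS = {
    'title': '',
    'has_post_data': False,
    'referrer_url': '',
    'is_overriding_user_agent': False,
    'http_status_code': 200,
}


def compact_entry(entry):
    """Drop every field whose value equals its assumed default."""
    return {
        key: value
        for key, value in entry.items()
        if key not in ENTRY_DEFAULTS or ENTRY_DEFAULTS[key] != value
    }


def expand_entry(stored):
    """Fill in the assumed defaults for fields missing from the stored entry."""
    return {**ENTRY_DEFAULTS, **stored}


entry = {
    'url': 'https://example.org/',
    'title': '',
    'has_post_data': False,
    'http_status_code': 404,
}
stored = json.loads(json.dumps(compact_entry(entry)))
restored = expand_entry(stored)
```

One consequence of this design is that changing a default later silently changes the meaning of already-written files, so the defaults would need to be tied to a format version.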

Benefits:

  • Instead of faking some data, we get a 1:1 representation of Qt/Chromium's internal serialization back - seems like a great idea, the current "assuming some default values" strategy worked (until now) but sure leaves a bad aftertaste. With the variety of information in there (referrer state and what not), it's really hard to judge the (also security) impact of the current approach.
  • We probably can store a more true representation of the page, with things like form input, POST data, etc. etc.
  • Less maintenance cost
  • No duplicate storing of information Chromium/Qt stores already (scroll position, zoom?), and no (duplicate) restoring "by hand"
  • Getting away from PyYAML

Downsides:

  • When the page state is unavailable (e.g. when switching from QtWebKit), we only have a "best effort" approach of loading the newest URL only.
  • It's harder to write/edit session files by hand (but there's little we can do about that, at least with Qt 5.15)
  • Everyone with scripts interpreting the YAML sessions will need to adjust them to interpret the .json instead - but it's still possible to get URL and title from it, as those are stored outside of the page state by Qt (TODO: title seems empty though?!).

Open questions:

  • How to make this interoperable with QtWebKit? Is the data similar enough? If switching from QtWebKit -> QtWebEngine (at least with 5.15), we'd likely have to fall back on just loading the newest URL again, because we're missing the page state.

@toofar
Member

toofar commented Apr 17, 2020

TODO: Find out if people really need the other information to persist

I would prefer it, but it would also be good if we could load a session on 5.14 and save it in the new format, so that under 5.15 it would all be loaded from the pagestate.

@Kingdread
Contributor

Kingdread commented Apr 17, 2020

Another curiosity: I've played around with version 23 a bit more and noticed that it does work, but only for https sites. I could plug in https://github.com and https://google.com and other sites, and it worked fine and loaded the page. But if I tried the non-SSL versions, it never finished loading on its own - even when using those exact sites, and with the rest of the code being identical.

In general though I agree that we shouldn't chase Chromium's internal serialization. They've said themselves that they don't want to stabilize the binary format, and there seems to be barely any documentation.

@The-Compiler
Member Author

I would prefer it, but if we could load a session on 5.14 and save it in the new format, so that under 5.15 it would all be loaded from the pagestate, that would be good.

That would mean keeping (at least part of) the current hacky serialization-faking code around, at least for a while. I'll see how I feel about that when actually implementing the rest, but I guess that's feasible. I'd probably ignore the less critical information (scroll position and zoom) at least.

@The-Compiler
Member Author

Something I forgot when saying this above:

  • For the page state, we save an additional <sessionname>-pagestate.bin binary file, treating it as an opaque blob (could also have it as base64 inside the JSON, but it might get big with form inputs and whatnot).

is that a pagestate exists for every history entry, not only once.

So I can see three ways to store it:

  1. base64 in JSON (perhaps gzipped first)
  2. As a <sessionname>-pagestate folder, with some <UUID>.bin files referenced in the JSON
  3. Store the entire session as something with better binary support, e.g. sqlite. However, session data isn't exactly suited well for SQL.

I'm probably going to go with 1., though it'd be really nice to know how big the data can get for more complex pages/navigations (perhaps with form inputs, file uploads, etc.). @toofar @Kingdread have you looked at the size of the page state data with real website requests/history so far?
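For reference, option 1 could look roughly like the sketch below. This is only an illustration using the standard library: the helper names and the JSON keys (`url`, `title`, `pagestate`) are made up here, and the blob is a stand-in for whatever opaque bytes Qt hands back.

```python
import base64
import gzip
import json


def encode_page_state(blob: bytes) -> str:
    """Gzip an opaque page-state blob and return it as ASCII text for JSON."""
    return base64.b64encode(gzip.compress(blob)).decode("ascii")


def decode_page_state(text: str) -> bytes:
    """Reverse encode_page_state()."""
    return gzip.decompress(base64.b64decode(text))


# Stand-in for the opaque bytes of one history entry (not real Qt data).
blob = b"\x00\x01opaque bytes from Qt" * 50

# Hypothetical history entry; the key names are invented for illustration.
entry = {
    "url": "https://example.org/",
    "title": "Example",
    "pagestate": encode_page_state(blob),
}

serialized = json.dumps(entry)
restored = decode_page_state(json.loads(serialized)["pagestate"])
```

Since there is one page state per history entry, each entry would carry its own encoded string.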

@toofar
Member

toofar commented Apr 22, 2020

have you looked at the size of the page state data with real website requests/history so far?

Nope, I haven't looked into this at all apart from following this issue. Is there a way we can dump them from the debug console?

Regarding storing them, we could use the zipfile module too? As I understand it we would only want to read/write all the history items for a tab at once so putting them in separate files sounds less than ideal. I suppose it comes back to the size of them whether you go with base64 json or something binary or zipped because if they are large the base64 one will be a bit more IO.

@Kingdread
Contributor

I don't have the exact numbers available at the moment, but they were also a few hundred bytes in size (like the one you dumped). I didn't use very "complex" sites though, so not a lot of inputs, frames, ... and I didn't have a long history - not sure how all of that affects the total size in the end.

If binary serialization would be acceptable (i.e. no human-readable JSON/YAML), what about other formats for serializing binary data? Something like BSON, MessagePack, Protobuf, ... It might add less overhead than base64-in-json, but the files would be even less readable than JSON-with-binary-blobs. I guess it'd be worthwhile checking how much data would actually need to be saved with a realistic history, and how much overhead the "simple" solution would produce.

@The-Compiler
Member Author

The-Compiler commented Apr 23, 2020

Nope, I haven't looked into this at all apart from following this issue. Is there a way we can dump them from the debug console?

Something like this maybe:

from qutebrowser.utils import qtutils
tab = objreg.get('tab', scope='tab', tab=0, window=0)
hist = tab.history._history
data = qtutils.serialize(hist)
print(len(data), len(hist))

That's the entire data though, not just the page state. Here it's some 7kb in total for a page with two history entries - so when only looking at the pagestate, that's probably a couple 100 bytes, which I guess is fine to have in base64. I'm mainly wondering if it can grow into kilo-/megabytes for a single entry.

Unfortunately, I think it can: Here on GitHub, I now get 115 kB total, increasing as I'm typing this message...

Regarding storing them, we could use the zipfile module too? As I understand it we would only want to read/write all the history items for a tab at once so putting them in separate files sounds less than ideal. I suppose it comes back to the size of them whether you go with base64 json or something binary or zipped because if they are large the base64 one will be a bit more IO.

I like that idea. That'd mean a mywork.json and a mywork-pagestate.zip for every session, which I guess is manageable.

If binary serialization would be acceptable (i.e. no human-readable JSON/YAML), what about other formats for serializing binary data? Something like BSON, MessagePack, Protobuf, ... It might add less overhead than base64-in-json, but the files would be even less readable than JSON-with-binary-blobs.

Might as well just use the QDataStream then, since the data already is in that format to begin with 😉

sqlite would have a couple of advantages: No new dependencies (same for QDataStream), and still easily read-/writeable with a tool most people likely already have installed (the sqlite commandline tool). Also it's something we're already using elsewhere in qutebrowser.
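A rough sketch of what the sqlite variant could look like, using only Python's built-in sqlite3 module. The table layout and column names are invented for illustration and do not correspond to anything that exists in qutebrowser:

```python
import sqlite3


def open_session_db(path: str) -> sqlite3.Connection:
    """Open (or create) a session database with one row per history entry."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        " win_idx INTEGER, tab_idx INTEGER, entry_idx INTEGER,"
        " url TEXT, title TEXT, pagestate BLOB,"
        " PRIMARY KEY (win_idx, tab_idx, entry_idx))"
    )
    return conn


conn = open_session_db(":memory:")  # a real session would use a file path
with conn:  # commits the transaction on success
    conn.execute(
        "INSERT INTO history VALUES (?, ?, ?, ?, ?, ?)",
        (0, 0, 0, "https://example.org/", "Example", b"\x00\xffraw"),
    )

# BLOB columns come back as bytes, so no base64 step is needed.
row = conn.execute(
    "SELECT url, pagestate FROM history"
    " WHERE win_idx = 0 AND tab_idx = 0 ORDER BY entry_idx"
).fetchone()
```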

Right now I'm leaning towards the zipfile idea. Still means two files for every session, but that seems okay to me.

@Kingdread
Contributor

Kingdread commented Apr 23, 2020

Right, I forgot about the built-in Qt serialization. To me, the zipfile approach sounds like a good middle ground - binary data for the page states, compression to save space, not producing a lot of single files with random names and easier cleanup/"GC" code, so +1 for the zipfile idea.

Edit: Theoretically, we could even pack the JSON into the .zip, that way we have a single, self-contained file with all of the benefits.

@The-Compiler
Member Author

Edit: Theoretically, we could even pack the JSON into the .zip, that way we have a single, self-contained file with all of the benefits.

Good point. I'm mainly worried about people who use session files as a "hack" to get a list of all currently open tabs/windows - IIRC, some people use something like that to have a "window switcher" which includes qutebrowser tabs and emacs buffers and such. With this change, they'll need to adjust their code anyways - and I guess some additional unzip -c mywork.zip session.json | ... won't hurt too much.

I'm pretty convinced that's the way to go forward, so unless there are any objections, I'm going to try implementing that as soon as I find the time.
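A minimal sketch of the single-file variant, assuming a `session.json` member (matching the `unzip -c` example above) plus one binary member per page state. The `pagestate/<name>.bin` layout and the function names are assumptions made for this example, not a settled format:

```python
import io
import json
import zipfile


def write_session(target, session: dict, page_states: dict) -> None:
    """Write session.json plus one binary member per page state."""
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("session.json", json.dumps(session, indent=2))
        for name, blob in page_states.items():
            zf.writestr(f"pagestate/{name}.bin", blob)


def read_session(source):
    """Return (session dict, {name: blob}) from an archive written above."""
    prefix, suffix = "pagestate/", ".bin"
    with zipfile.ZipFile(source) as zf:
        session = json.loads(zf.read("session.json"))
        states = {
            member[len(prefix):-len(suffix)]: zf.read(member)
            for member in zf.namelist()
            if member.startswith(prefix) and member.endswith(suffix)
        }
    return session, states


# The JSON only references the page state by a (hypothetical) key.
session = {"windows": [{"tabs": [{"history": [
    {"url": "https://example.org/", "pagestate": "0-0-0"},
]}]}]}

buf = io.BytesIO()  # stands in for mywork.zip on disk
write_session(buf, session, {"0-0-0": b"\x00\x01opaque"})
buf.seek(0)
loaded, states = read_session(buf)
```

Passing `zipfile.ZIP_STORED` instead would give an uncompressed archive, which keeps the JSON member readable with plain `unzip -c` either way.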

The-Compiler added a commit that referenced this issue Jan 27, 2021
Also don't show it for new users - this doesn't really help much if
someone just started using qutebrowser.

See #5359
@The-Compiler
Member Author

This has been dragging along way longer than I'd like, but I'm afraid I'll need to delay this once again. I'd like to get v2.0.0 into the next Debian Stable, but for that to happen it needs to be released ASAP.

I've played with the minimal required format in #5359 (comment) and #5359 (comment) again in the hope that something changed with Qt 5.15.2 (and the enclosed Chromium upgrade), but unfortunately that wasn't the case. It looks like there's really no way around having a new session format... 😞

Unfortunately the new session format is a major undertaking and also impacts some old PRs which I'd like to look at first, and I just haven't found the time to do this so far. It's still quite high on my roadmap though, and I hope to finally get back soon after v2.0.0 is released.

@The-Compiler The-Compiler modified the milestones: v2.0.0, v2.1.0 Jan 27, 2021
The-Compiler added a commit that referenced this issue Feb 16, 2021
This means sessions need to be initialized after websettings, because
initializing websettings also initializes QtWebEngine and thus
qutescheme. This needs to happen before sessions.init() calls
version.webengine_versions(). I don't think this should be a problem, as
they are independent to each other.

Fixes #5738
See #5359
Also switches sessions.init() to pathlib, see #176.
@The-Compiler
Member Author

I feel bad about having to push this back yet again, but it's time for a v2.1 since various fixes were needed for newer QtWebEngine/PyQt versions. I still think it'd be bad to hold back new releases because this hasn't been fixed yet.

I hope with QtWebEngine 5.15.3 being the last 5.15 release with a Chromium update (rather than just backporting security fixes), Qt/PyQt should stop throwing me curve-balls needing time for a while (until Qt 6.2 is ready of course).

@The-Compiler The-Compiler removed this from the v2.1.0 milestone Mar 12, 2021
The-Compiler added a commit that referenced this issue Mar 12, 2021
toofar added a commit to toofar/qutebrowser that referenced this issue Aug 18, 2021
With Qt 5.15 the underlying chromium version switched to using a new
page serialization format. Additionally the deserialization is much more
fragile and we don't have enough of an API from webengine to pull out
enough of the necessary page attributes to construct something that we
can deserialize into a new page that works. So now we attempt to dump
the whole page state along with the session. This should be backwards
compatible so if you save a session with this version of qutebrowser on
5.14 and then load it on 5.15 you should still get your session history.

I have no idea how fragile the parsing is.

TODO:
cleanup
abstractisize (make work for webkit)
test on other versions and document weirdnesses
add version number to session file and save backups on loading older ones?

ref qutebrowser#5359
@The-Compiler The-Compiler unpinned this issue Sep 20, 2021
@cosminadrianpopescu
Contributor

cosminadrianpopescu commented Jan 9, 2022

I've tested the commit 8891ce9 and it works fine for web engine. I haven't tested it for webkit.

@cosminadrianpopescu
Contributor

So, I've been using this commit (8891ce9) for a few days now and it looks promising.

The only downside seems to be that it makes the session file quite large. I have a session with 25 tabs opened and it is already almost 400K. I don't know if this is going to become an issue. It seems that with every history item the session file grows.

@The-Compiler
Member Author

Given that some people already see quite severe performance issues with PyYAML (#2777) and that saving binary data as base64 in it has a bit of overhead too, I think that will turn into a problem indeed...

That's why above I proposed to turn sessions into (potentially uncompressed) zip files, so that we can store the binary data as binary, and in the long run also do stuff like storing multiple historical versions of a session. Unfortunately, I never got around to actually implementing it so far...
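To put a rough number on the base64 part of that overhead: base64 turns every 3 input bytes into 4 output characters, so the encoded data is about a third larger than the raw bytes, before any YAML quoting or line wrapping. A quick check (the 300 000 byte blob is arbitrary filler, not a measured page state):

```python
import base64
import math


def base64_size(n_bytes: int) -> int:
    """Length of the padded base64 encoding of n_bytes bytes."""
    return 4 * math.ceil(n_bytes / 3)


raw = bytes(300_000)            # arbitrary filler data
encoded = base64.b64encode(raw)
overhead = len(encoded) / len(raw)
print(len(raw), len(encoded), round(overhead, 3))
```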

@toofar
Member

toofar commented Jan 11, 2022

As a data point (I agree doing it like this is not a good idea for widespread use, not everyone has my luxury in hardware) my session file with the page state in base64 is 5.3M (68k lines) and it usually takes 0.5 - 0.7 of a second to save to an SSD (and with the C yaml extensions compiled of course!). I've moved the yaml dumping to a thread, I'm not sure if that helps but I don't experience any pauses from it.

@cosminadrianpopescu
Contributor

I've moved the yaml dumping to a thread, I'm not sure if that helps but I don't experience any pauses from it.

Is there a commit for this?

@toofar
Member

toofar commented Jan 14, 2022

@cosminadrianpopescu
nope, it's just in a merge commit. I guess I might add it to my branch but tbh I assumed it didn't make a difference. I was having a performance issue but I think I discovered it was something else (I don't remember what though so maybe I confused myself). So in lieu of going back and checking that it is worthwhile here is some copypasta (yeah, it is this trivial), the line numbers might be a bit off:

patch? (click to expand)


diff --git i/qutebrowser/misc/sessions.py w/qutebrowser/misc/sessions.py
index 888252dc7162..f5112822fcdb 100644
--- i/qutebrowser/misc/sessions.py
+++ w/qutebrowser/misc/sessions.py
@@ -303,6 +306,8 @@ class SessionManager(QObject):
         path = self._get_session_path(name)
 
         log.sessions.debug("Saving session {} to {}...".format(name, path))
+        import time
+        s = time.time()
         if last_window:
             data = self._last_window_session
             if data is None:
@@ -313,11 +318,20 @@ class SessionManager(QObject):
                                   with_private=with_private)
         log.sessions.vdebug(  # type: ignore[attr-defined]
             "Saving data: {}".format(data))
-        try:
-            with qtutils.savefile_open(path) as f:
-                utils.yaml_dump(data, f)
-        except (OSError, UnicodeEncodeError, yaml.YAMLError) as e:
-            raise SessionError(e)
+        if not data:
+            return name
+        import threading
+        t=threading.Thread(target=yaml_dump_thread, args=(path, data))
+        t.start()
+
+        a = time.time()
+        log.sessions.info(f"session '{name}' save took {a-s}s")
 
         if load_next_time:
             configfiles.state['general']['session'] = name

@@ -493,6 +522,20 @@ class SessionManager(QObject):
         return sorted(sessions)
 
 
+def yaml_dump_thread(path, data):
+    from qutebrowser.utils import utils
+    import time
+    try:
+        s = time.time()
+        #with qtutils.savefile_open(path) as f:
+        with open(path, 'w') as f:
+            utils.yaml_dump(data, f)
+        a = time.time()
+        log.sessions.info(f"session '{path}' save took {a-s}s")
+    except (OSError, UnicodeEncodeError, yaml.YAMLError) as e:
+        message.error(str(e))
+
+
 @cmdutils.register()
 @cmdutils.argument('name', completion=miscmodels.session)
 def session_load(name: str, *,
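One thing to note about the patch above is that it swaps `qtutils.savefile_open` for a plain `open()`, so a crash in the middle of the write could leave a truncated session file (as far as I understand, `savefile_open` writes via Qt's QSaveFile to avoid exactly that). A stdlib-only sketch of keeping both the background thread and an atomic replace; the function names are hypothetical and `json` stands in for the YAML dumper to keep the example self-contained:

```python
import json
import os
import tempfile
import threading


def dump_atomically(path: str, data: dict) -> None:
    """Write to a temp file next to `path`, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
        os.replace(tmp_path, path)  # atomic when both are on one filesystem
    except BaseException:
        os.unlink(tmp_path)  # don't leave the partial temp file behind
        raise


def save_in_background(path: str, data: dict) -> threading.Thread:
    """Start the dump in a worker thread and return the thread."""
    thread = threading.Thread(target=dump_atomically, args=(path, data))
    thread.start()
    return thread


target = os.path.join(tempfile.mkdtemp(), "mywork.json")
worker = save_in_background(target, {"windows": []})
worker.join()  # only for the example; a real caller would not block here
with open(target, encoding="utf-8") as f:
    saved = json.load(f)
```

The data handed to the thread should not be mutated while the dump is running.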

@Mrestof

Mrestof commented Jul 31, 2022

Wow, I couldn't imagine this issue was such a big thing. Is there some workaround rn to have your tabs history restored after loading the session? As I understand, the binary files for each session (in format <sessionname>-pagestate.bin) were not introduced, because for each session I have only one yaml file.

@toofar
Member

toofar commented Aug 2, 2022

@Mrestof nope, no workaround sorry. There has been surprisingly little pushback against the breakage so it kinda dropped down the priority list. Go give the top post a 👍 while you are here.

@Architector4

Architector4 commented Dec 4, 2022

Would it be possible to introduce an incomplete version of sessions.lazy_restore as a cheap workaround? The session file stores the names and links of each of the tabs, and for me personally it would be way more helpful to have qutebrowser just load minimal facade pages instead of real ones when loading the session, and have each one redirect to the link the tab had once it is focused, via simple JavaScript, disregarding the history or anything else the session data may have.

Even if admittedly very incomplete, that sounds very easy to implement. It would not need saving anything new, or doing much really besides interpreting the already existing session file a bit differently (switching up the links to the facade page) and making the facade page, and it would be enough for some tab hoarders like me. Of course it would need to be under a separate setting, now that some people still have sessions.lazy_restore set and have forgotten about it, and would not appreciate their full session data getting messed up, but yeah lol

@The-Compiler
Member Author

@Architector4 Probably! The current way lazy restore works always has been quite a hack. I suppose it would be possible for it to just open the given URL in JS instead of going back in the history, yeah.

I don't think it needs to be a separate setting. Restoring the full session data (including history) is broken by the workaround currently anyways...
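A rough idea of what such a placeholder page could look like, generated from the URL and title already stored in the session. This is only a sketch: `placeholder_page` is a made-up helper, and it assumes background tabs report `document.visibilityState` as hidden until they are focused.

```python
import html
import json


def placeholder_page(url: str, title: str) -> str:
    """Build a tiny HTML page which navigates to `url` once it is visible."""
    # json.dumps gives a valid JS string literal; escaping "<" keeps a
    # "</script>" inside the URL from terminating the script element.
    js_url = json.dumps(url).replace("<", "\\u003c")
    return f"""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>{html.escape(title)}</title>
<script>
const target = {js_url};
function go() {{
  if (document.visibilityState === "visible") {{
    location.replace(target);  // replace, so the placeholder leaves no history entry
  }}
}}
document.addEventListener("visibilitychange", go);
go();
</script>
</head>
<body><a href="{html.escape(url, quote=True)}">{html.escape(title)}</a></body>
</html>
"""


page = placeholder_page("https://example.org/?q=</script>", "Example & Co")
```

The plain link in the body is a fallback for when JavaScript is disabled.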

@lypanov

lypanov commented Jan 13, 2024

Save/restore is bad enough here (just with quit, not even kill) locally that I've been forced to go back to my old browser alas. I have lost too many tabs / too many sessions and can no longer rely on it for my work. Shame as I really enjoy the qutebrowser UI.

@VehementHam

Maybe there is a way to hack around this. Like tabs are stored in a file, and then that file is referenced when a window is re-opened.

@VehementHam

VehementHam commented Apr 20, 2024

Seems like it could be done in the config, if you create a custom keybind. open keybinds would also echo the URL to a file, and then there would be a startup command that would open all the URLs.
