Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Garbled contents when reloading file with DOS line endings #3303

Open
scurest opened this issue May 19, 2024 · 8 comments
Open

Garbled contents when reloading file with DOS line endings #3303

scurest opened this issue May 19, 2024 · 8 comments
Labels

Comments

@scurest
Copy link

scurest commented May 19, 2024

Description of the problem

When reloading a DOS-mode file (CRLF line endings) that has changed on disk, the file contents are garbled.

Steps to reproduce

To help repro, here's a Python script that writes a random test.txt with DOS line endings.

# a.py
import random, sys
random.seed(sys.argv[1])
with open('test.txt', 'w') as f:
    for _ in range(6):
        f.write(str(random.randint(0,999)))
        f.write("\r\n")

I run python a.py 1 and then open the test file with micro test.txt.

Then I run python a.py 2 to change the file. Micro prompts me to reload: The file on disk has changed. Reload file? (y,n,esc). I hit y. I get

Note that this is not a display issue, the actual contents are garbled. You can keep repeating this, python a.py 3 and so on.

Note that this does NOT happen if a.py is changed to write \n line endings AND if micro initially opened the file in unix mode.

Specifications

Commit hash: 68d88b5
OS: Arch Linux
Terminal: xterm 390-2

@dmaluka dmaluka added the major label May 19, 2024
@Gavin-Holt
Copy link

Gavin-Holt commented May 19, 2024

Hi

I noted this problem a while back (#2756), but could not make it reproducible.

At that time I assumed it was due to my inferior OS!

We should be careful not to confuse reopen and reload these are different commands:

  • reopen : undocumented command to reopen current buffer using bp.Buf:ReOpen() function
  • reload : this command reloads all runtime files (bindings.json, settings.json, init.lua, ?plugins?)

Confusingly, settings.json includes an option named reload:

  • "controls the reload behavior of the current buffer in case the file has changed. The available options are prompt, auto & disabled" which calls bp.Buf:ReOpen() !

I would be interested in a fix, so my spellchecking function (#2744) correctly reloads altered files.

Kind Regards Gavin Holt

@dmaluka
Copy link
Collaborator

dmaluka commented May 19, 2024

FWIW I can easily reproduce it, with the newest micro from master.

  1. open a file with DOS line endings (\r\n)
  2. open the same file in another editor (or even in another instance of micro), add any text somewhere in the middle of the file, and save it
  3. reload the file (i.e. answer "yes" to the reload prompt) in the original instance of micro => result: the new text is inserted in the wrong place.

It's clearly a bug, and hopefully easy to fix (just need to find free time to do that).

We should be careful not to confuse reopen and reload these are different commands:

Yeah, they are doing completely different things.

  • reopen : undocumented command to reopen current buffer using bp.Buf:ReOpen() function

Indeed, and it might be a good moment to document it.

  • reload : this command reloads all runtime files (bindings.json, settings.json, init.lua, ?plugins?)

Yep, it reloads plugins too.
We are considering improving its documentation: #3240 (review)

Confusingly, settings.json includes an option named reload:

  • "controls the reload behavior of the current buffer in case the file has changed. The available options are prompt, auto & disabled" which calls bp.Buf:ReOpen() !

The reopen command does the same thing that is done automatically if the reload option is set to prompt or auto, i.e. reloads the file. So it's not surprising that ReOpen() is called in both cases.

Yeah, it might be confusing that the reload command (not option) does a completely different thing.

@scurest
Copy link
Author

scurest commented May 19, 2024

Thanks for the pointer to reopen. So it looks like the error occurs in EventHandler.ApplyDiff. Here's a fragment of the diff operations performed

State of LineArray
  "569"
  "489"
  "45"

Diff to Process
  {"Equal" "9\r\n"} @ {X:2 Y:0}

State of LineArray
  "569"
  "489"
  "45"

Diff to Process
  {"Delete" "48"} @ {X:1 Y:1}      <- NOTE: should be {X:0 Y:1}

State of LineArray
  "569"
  "4"           <- NOTE: 89 deleted instead of 48
  "45"

So the error seems to be that when \r\n is encountered, the location advances two positions to the right, instead of only one. This misaligns the locations for all subsequent diff operations, which produces the garbled text.

@JoeKar
Copy link
Collaborator

JoeKar commented May 22, 2024

So the error seems to be that when \r\n is encountered, the location advances two positions to the right, instead of only one.

Thank you for the great hint!

if !isMark(r) {

->

if r != '\r' && !isMark(r) {

Most probably there is a more elegant and especially generic way, since this was just shot out of my hip. 😉

@scurest:
Does the proposal work for you too?

@scurest
Copy link
Author

scurest commented May 26, 2024

ApplyDiff is also wrong in the presence of marks.

echo -e 'Te\u0302st' > test.txt  # COMBINING CIRCUMFLEX ACCENT
# Open in micro, shows 'Têst'
echo -e 'Test' > test.txt
# Reload in micro, still shows 'Têst'

@Gavin-Holt
Copy link

Hi,

ApplyDiff sounds complicated, and not strictly necessary for reopen.

Could reopen just read the whole disc copy and reposition the cursor to the last known editing location?

Kind Regards Gavin Holt

PS. I would also expect reopen to clear the undo stack.

@scurest
Copy link
Author

scurest commented May 29, 2024

ApplyDiff maintains the "logical" cursor position.

echo 'The bottle says DRINK ME' > test.txt
# Open in micro, 'The bottle says DRINK |ME' (| denotes cursor location)
echo 'The cake says EAT ME' > test.txt
# Reload in micro, 'The cake says EAT |ME' (cursor maintained at ME)

Is this the only reason it's used? I would also be fine giving this up for correct reopen.

Also the line-ending mode is not re-detected when reopening.

@JoeKar
Copy link
Collaborator

JoeKar commented May 30, 2024

Is this the only reason it's used?

No. The main purpose is to handle every insertion and/or deletion independent to track it inside the undo buffer. Furthermore the modifications can be visualized with the diffgutter.
The cursor position is handled here:

b.RelocateCursors()

So bypassing or removing ApplyDiff is no option so far. It just need to work correctly in these scenarios.

Currently you broke it down to some minimal reproduction cases, which is good so far.
Just image the use case where you've a much larger file which is edited outside of you current buffer...maybe you would like to see the diff and maybe you'd like to undo something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants