Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Priority for piece validation #2082

Open
ThaUnknown opened this issue May 21, 2021 · 18 comments
Open

Priority for piece validation #2082

ThaUnknown opened this issue May 21, 2021 · 18 comments

Comments

@ThaUnknown
Copy link
Member

ThaUnknown commented May 21, 2021

What version of this package are you using?
v0.118.0

What problem do you want to solve?
Whenever adding a torrent with an existing store, its pieces need to be re-validated, this happens sequentially, meaning you need to wait for validation if you want to use a specific piece. This becomes an issue for formats like matroska where the video's metadata is at the very end of the file.

What do you think is the correct solution to this problem?
Create a priority for torrent pieces to be validated based on incoming request such as file.createReadStream(), this could potentially allow for users to read files before waiting for a potentially massive file to finish validation and maybe even start downloading missing pieces before validation is finished? This might not be possible tho.

Are you willing to submit a pull request to implement this change?
No, no idea how to solve this issue.

@SilentBot1
Copy link
Member

SilentBot1 commented May 22, 2021

Hi @ThaUnknown,

I can understand the use for this feature, but with how the _verifyPeices function is currently implemented, this is not currently possible and would require a re-write of _verifyPieces. As this is a valid and possibly useful feature request, I'll leave this open to see if anybody would like to / be able to help with this.

As a workaround, in some cases if it's safe to assume that the store hasn't been modified, the torrent opts.skipVerify option can be passed when re-adding a torrent to a client to prevent any file verification, which would allow the fast startup you're seeking, though depending on the store you're using, e.g. IndexedDB, it's not always safe to assume that localStorage hasn't been modified.

Additionally, if the content is already downloaded and in a local storage, would it be possible to have these added and loaded to the client to prior to the attempt to load the content? - this would also help keep the swarm healthy with active webseeds, though depending on how many torrents the client is caching in your application, this may be infeasible.

As a side note, depending on the content and how the MKV is encoded, you can also attempt to render this in a WEBM tag. I've had luck doing this in the past with some sources working due to WEBM being a subset of MKV, though features like subtitles had to be extracted separately, converted to WebVTT and then re-added to the content, preventing the need for the full file to be downloaded before playback starts.

If you want to have a further chat regarding this, feel free to continue the conversation here, or alternatively you can reach out to me via the email on my GitHub profile.

Many thanks,
Brad Marsden

@ThaUnknown
Copy link
Member Author

ThaUnknown commented May 22, 2021

@SilentBot1
Since I created this request I've looked at possible implementations of this feature, and it wouldn't actually require a re-write, just a priority queue similarly to how piece fetching priority functions.

9/10 times skipVerify is the wrong solution since if a piece hasn't actually been downloaded WT is gonna assume it has been, which would break pretty much any video player out there, especially if this was metadata

? I have no idea what you're trying to say here.

indeed, webm is a subset of matroska, it would have the same exact issue as I described above, additionally an external subtitle file would also have the exact same issue as I described above since 9/10 times it would be ordered after the video file. webm is vastly inferior to matroska which every1 knows, and which is why you won't find many torrents using that format. Converting pretty much any subtitle format to VTT is lossy, and doesn't work at all [yes I've tried]. Having an external subtitle file wouldn't solve the issue I described either, the video metadata would still be at the end of the file which is how the format was specced,
even mp4 would have this issue:
image
I only used mkv as a VERY VERY widespread example in the torrenting scene

@ThaUnknown
Copy link
Member Author

@DiegoRBaquero

@DiegoRBaquero
Copy link
Member

Without keep state of the downloaded pieces (the bitfield) then it's very hard to do anything without verifying, although we could prioritize verifying, why not allow this state to be persisted and read?

@ThaUnknown
Copy link
Member Author

Without keep state of the downloaded pieces (the bitfield) then it's very hard to do anything without verifying, although we could prioritize verifying, why not allow this state to be persisted and read?

mainly external modification I guess? someone could modify or delete the file, so it needs to be re-scanned every time it's re-added as a torrent

@github-actions
Copy link

github-actions bot commented Aug 3, 2021

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

@github-actions github-actions bot added the stale label Aug 3, 2021
@ThaUnknown
Copy link
Member Author

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

noooo, not stale

@alxhotel alxhotel removed the stale label Aug 3, 2021
@github-actions
Copy link

github-actions bot commented Oct 3, 2021

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

@github-actions github-actions bot added the stale label Oct 3, 2021
@ThaUnknown
Copy link
Member Author

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

bump

@ThaUnknown ThaUnknown removed the stale label Oct 3, 2021
@github-actions
Copy link

github-actions bot commented Dec 3, 2021

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

@github-actions github-actions bot added the stale label Dec 3, 2021
@ThaUnknown
Copy link
Member Author

tes

@gauravsaini
Copy link

gauravsaini commented Dec 2, 2022

const { PriorityQueue } = require('p-queue')

// Create a priority queue instance with a custom comparison function that
// sorts elements based on their "priority" property
const queue = new PriorityQueue((a, b) => a.priority - b.priority)

function _verifyPieces () {
  if (this.destroyed) return

  let done = true
  for (let index = 0; index < this.pieces.length; index++) {
    const piece = this.pieces[index]
    if (piece.verified || piece.verifying) continue

    // If we are at the max number of concurrent verifications, skip this piece
    if (this._numVerifying >= this._verifyConcurrency) {
      done = false
      continue
    }

    // Add the piece to the queue with a priority based on the time when the
    // piece was requested for verification
    queue.add({ piece, priority: Date.now() })
  }

  // Continue verifying pieces from the queue until we reach the max concurrency
  while (this._numVerifying < this._verifyConcurrency) {
    // Dequeue the next piece from the priority queue
    const { piece } = queue.dequeue()

    // If the piece is already verified or currently being verified, skip it
    if (piece.verified || piece.verifying) continue

    // Mark the piece as being verified and verify it
    piece.verifying = true
    this._numVerifying++
    this._verify(piece)
  }

  if (done) this.emit('done')
}

@gauravsaini
Copy link

gauravsaini commented Dec 2, 2022

`
// Change this
function _onVerifyPiece (piece) {
this.destroyed || this._verifyPending++

this.store.get(piece.index, { offset: 0, length: piece.length }, (err, buffer) => {
if (this.destroyed) return
if (err) return this._destroy(err)
if (this.destroyed) return

// Remove the piece from the queue, as it has been verified
queue.remove(item => item.piece === piece)

// Verify the piece
const sha1 = this.torrent.pieces[piece.index]
if (sha1.equals(sha1_digest(buffer))) {
  debug('piece verified %s', piece.index)
  this.pieces[piece.index] = null
  this._reservations[piece.index] = null
  this.bitfield.set(piece.index, true)

  // Update the verified length for the torrent
  this._calculateVerifiedLength()

  // Check if we have all the verified pieces and can signal 'done'
  this._checkDone()
} else {
  this.emit('warning', new Error('Piece ' + piece.index + ' failed verification'))
}

this._verifyPending--

// Continue verifying pieces from the queue until the max concurrency is reached
for (let i = 0; i < queue.length && this._numVerifying < this.maxConns; i++) {
  const next = queue[i]
  this._verify(next.piece)
}

this._checkDone()

}
}

`

@ThaUnknown
Copy link
Member Author

not quite, this doesn't work with torrent.select, which can happen in the middle of the store validation, also this won't work with the select changes planned for v2

@dminkovsky
Copy link

What are the select changes for v2? Can I read about them somewhere?

@dminkovsky
Copy link

I am seeing #2260

@ThaUnknown
Copy link
Member Author

I am seeing #2260

#2115

@dminkovsky
Copy link

Thanks @ThaUnknown. The select()/deselect() approach looks nice there!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants