filter seen doesn't work if fields changed #3975

Open
trim21 opened this issue Apr 24, 2024 · 7 comments
Comments

@trim21
Contributor

trim21 commented Apr 24, 2024

Expected behaviour:

The seen plugin is expected to reject entries that were already fetched.

However, I found that after the config is updated, it may never remember any entry.

For example, I have an RSS feed with a dynamic torrent URL:

<enclosure url="http://127.0.0.1:8745/torrent?q={...}" length="1024" type="application/x-bittorrent"/>

The {...} part may change, but the title and guid never change.

So I could have a config that looks like this:

tasks:
  test2:
    limit:
      amount: 20
      from:
        rss: http://127.0.0.1:8745/rss?q=1
    accept_all: true

    seen:
      fields:
        - original_url
#        - title
        - guid
    transmission: ...
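
The intent of this config can be illustrated with a toy model. This is only a sketch, not FlexGet's actual seen implementation: the helper names `seen_keys` and `filter_seen` and the in-memory `store` are hypothetical, and it assumes an entry is rejected when any configured field value was remembered earlier.

```python
# Toy model only: NOT FlexGet's real seen plugin. All names here are hypothetical.
def seen_keys(entry: dict, fields: list[str]) -> set[str]:
    """Collect the values of the configured fields that exist on the entry."""
    return {str(entry[field]) for field in fields if field in entry}


def filter_seen(entry: dict, fields: list[str], store: set[str]) -> bool:
    """Return True if the entry should be rejected; otherwise remember its keys."""
    keys = seen_keys(entry, fields)
    if keys & store:
        return True
    store.update(keys)
    return False


# Two fetches of the same item: the torrent URL differs, title and guid do not.
first = {
    "title": "example title",
    "guid": "63a7820e3abee02347b07d8a0473db7ee49af2d1",
    "original_url": "http://127.0.0.1:8745/torrent?q=aaa",
}
second = dict(first, original_url="http://127.0.0.1:8745/torrent?q=bbb")

# With guid among the fields, the second fetch matches the remembered guid.
store: set[str] = set()
first_rejected = filter_seen(first, ["original_url", "guid"], store)
second_rejected = filter_seen(second, ["original_url", "guid"], store)

# With only original_url, the changing token means nothing ever matches.
url_store: set[str] = set()
filter_seen(first, ["original_url"], url_store)
url_only_rejected = filter_seen(second, ["original_url"], url_store)
```

In this model the second fetch is rejected only when a stable field such as guid is part of `fields`, which is the behaviour the config above is meant to produce.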

However, if I change seen.fields, the seen plugin becomes a no-op.

For example, changing from this (config 1):

    seen:
      fields:
        - title

to this (config 2):

    seen:
      fields:
        - guid

and running the task multiple times, the entry is always passed on to seen_info_hash; the seen plugin never rejects it.

Actual behaviour:

The seen plugin never rejects the entry; it is only rejected later by seen_info_hash.

Steps to reproduce:

  • use config 1
  • execute task
  • use config 2
  • execute task again
  • execute task again (the entry should be rejected by the seen plugin, not by seen_info_hash)
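
The steps above could be scripted roughly as follows. This is a sketch under several assumptions: the `flexget` executable is on PATH, the two config variants live in hypothetical files passed in as `config1` and `config2`, and the variants are copied over a single active config file (rather than passed to FlexGet separately) on the assumption that FlexGet keeps its stored state per config file.

```python
import shutil
import subprocess
from pathlib import Path


def execute_cmd(config: Path, task: str = "test2") -> list[str]:
    """Build the FlexGet command line that executes a single task."""
    return ["flexget", "-c", str(config), "execute", "--tasks", task]


def reproduce(active: Path, config1: Path, config2: Path, run=subprocess.run) -> None:
    """Run config 1 once, then config 2 twice, always through the same active file."""
    for variant in (config1, config2, config2):
        shutil.copyfile(variant, active)  # swap the variant into place
        run(execute_cmd(active), check=False)


# Example call (file names are placeholders):
# reproduce(Path("config.yml"), Path("config1.yml"), Path("config2.yml"))
```

The `run` parameter exists only so the sequence can be checked with a stand-in instead of launching FlexGet.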


Additional information:

  • FlexGet version: current develop dee678c
  • Python version:
  • Installation method:
  • Using daemon (yes/no):
  • OS and version:
  • Link to crash log:
import random
from pathlib import Path
from typing import Annotated

import fastapi
from fastapi import Query
from loguru import logger
from starlette.responses import Response


app = fastapi.FastAPI(debug=True)


@app.get("/rss")
def generate_rss():
    token = random.randbytes(12).hex()
    rss = f"""
 <rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
    <channel>
        <title>TT tt tt</title>
        <link>https://example.com</link>
        <description>desc</description>
        <dc:creator>tt</dc:creator>
        <item>
            <title>example title</title>
            <link>https://example.com/777581</link>
            <description>
                <![CDATA[ hi ]]></description>
            <enclosure
                    url="http://127.0.0.1:8745/torrent?q={token}"
                    length="1024" type="application/x-bittorrent"/>
            <pubDate>Wed, 24 Apr 2024 14:30:42 GMT</pubDate>
            <comments>https://example.com</comments>
            <guid isPermaLink="false">63a7820e3abee02347b07d8a0473db7ee49af2d1</guid>
            <dc:creator>N/A</dc:creator>
            <dc:date>2024-04-24T14:30:42Z</dc:date>
        </item>
    </channel>
</rss>
    """

    logger.info("generate rss with torrent token query q={}", token)

    return Response(content=rss.encode(), media_type="application/xml")


@app.get("/torrent")
def torrent_download(q: Annotated[str, Query()]):
    logger.info("torrent downloaded q={}", q)
    raise ValueError("please provide a valid torrent here")
    return Response(content=Path(...).read_bytes())


if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, port=8745)
@trim21
Contributor Author

trim21 commented Apr 24, 2024

This can also be worked around by running flexget seen forget --task test2 '*'

@BrutuZ
Contributor

BrutuZ commented Apr 24, 2024

Entries end up rejected by seen_info_hash because it runs before seen

$ flexget plugin | grep "seen"
| seen               | task               | filter(255),       | doc, builtin |
| seen_info_hash     | task               | filter(180),       | doc, builtin |
| seen_movies        | task               | filter(-255),      | doc          |

You could either override the plugin priority or disable seen_info_hash altogether

EDIT: Turns out I misremembered it, priority goes High to Low, not the other way around.
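
As a quick illustration of the high-to-low ordering, here is a small sketch using the filter priorities from the table above. The sorting itself is only a model of the ordering described in this comment, not FlexGet's scheduling code, and the override value is a hypothetical example.

```python
# Filter-phase priorities as listed by `flexget plugin | grep "seen"` above.
priorities = {"seen": 255, "seen_info_hash": 180, "seen_movies": -255}

# Higher priority runs first, so sort in descending order.
run_order = sorted(priorities, key=priorities.get, reverse=True)

# Hypothetical override: giving seen a priority below 180 moves it after seen_info_hash.
overridden = dict(priorities, seen=170)
overridden_order = sorted(overridden, key=overridden.get, reverse=True)
```

With the default values, seen comes first in `run_order`; with the hypothetical override, seen_info_hash does.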

@trim21
Contributor Author

trim21 commented Apr 24, 2024

No, that doesn't work.

I used the same config 1 from the issue description and this as config 2. Torrents are still downloaded, and the entry is still rejected by seen_info_hash:

    seen:
      priority: 10
      fields:
        - original_url
        - title
#        - guid

@trim21
Contributor Author

trim21 commented Apr 24, 2024

No, it indeed doesn't work...

If I execute the task with a config that has no seen section, then edit the config to add seen with fields, the seen plugin doesn't work.

    plugin_priority:
      seen: 170
    seen:
      fields:
        - original_url
        - guid

I added logging to these two plugins:

2024-04-25 00:02:04 VERBOSE  task_queue                    There are 1 tasks to execute. Shutdown will commence when they have completed.
2024-04-25 00:02:04 VERBOSE  rss           test2           Bozo error <class 'xml.sax._exceptions.SAXParseException'> while parsing feed, but entries were produced, ignoring the error.
2024-04-25 00:02:04 VERBOSE  details       test2           Produced 1 entries.
2024-04-25 00:02:04 INFO     seen          test2           handle task 'example title'
2024-04-25 00:02:04 INFO     seen          test2           handle task 'example title'
2024-04-25 00:02:04 INFO     seen          test2           handle task 'example title'
2024-04-25 00:02:04 VERBOSE  task          test2           ACCEPTED: `example title` by accept_all plugin
2024-04-25 00:02:04 INFO     download      test2           Downloading: example title
2024-04-25 00:02:04 VERBOSE  details       test2           Summary - Accepted: 1 (Rejected: 0 Undecided: 0 Failed: 0)
2024-04-25 00:02:04 INFO     seen          test2           handle task 'example title'
2024-04-25 00:02:04 INFO     seen          test2           handle task 'example title'
2024-04-25 00:02:04 INFO     remember_rej  test2           Remembering rejection of `example title`
2024-04-25 00:02:04 VERBOSE  task          test2           REJECTED: `example title` by seen_info_hash plugin because entry with torrent_info_hash `D3CB4E9FBC394993E6EF11F16287F8C2B39E75F5` is already marked seen in the task test2 at 2024-04-24 23:56
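
To check at a glance which plugin made each decision in a log like this, a small parser can help. This is a sketch that assumes the ACCEPTED/REJECTED line format shown above; the regular expression and the `decisions` helper are hypothetical and not part of FlexGet.

```python
import re

# Matches lines such as: REJECTED: `example title` by seen_info_hash plugin ...
DECISION = re.compile(r"(ACCEPTED|REJECTED): `(.+?)` by (\w+) plugin")


def decisions(log_text: str) -> list[tuple[str, str, str]]:
    """Return (decision, entry title, plugin name) for every decision line."""
    found = []
    for line in log_text.splitlines():
        match = DECISION.search(line)
        if match:
            found.append(match.groups())
    return found


# Two lines copied from the log above.
sample = """\
2024-04-25 00:02:04 VERBOSE  task          test2           ACCEPTED: `example title` by accept_all plugin
2024-04-25 00:02:04 VERBOSE  task          test2           REJECTED: `example title` by seen_info_hash plugin because entry with torrent_info_hash `D3CB4E9FBC394993E6EF11F16287F8C2B39E75F5` is already marked seen in the task test2 at 2024-04-24 23:56
"""

parsed = decisions(sample)
```

For the log above, the only rejection comes from seen_info_hash; no line attributes a rejection to seen.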

@trim21
Contributor Author

trim21 commented Apr 24, 2024

NOTE: this doesn't happen from a clean state. To reproduce this bug, you must first run the task without a seen config, then edit the config to add seen fields.

@gazpachoking
Member

You want to disable seen_info_hash? You can disable built-in plugins with the disable plugin.

disable:
  - seen_info_hash

Or, you could explicitly configure it as off:

seen_info_hash: no

Or have I misinterpreted the issue?

@trim21
Contributor Author

trim21 commented Apr 25, 2024

> You want to disable seen_info_hash? You can disable built-in plugins with the disable plugin.
>
> disable:
>   - seen_info_hash
>
> Or, you could explicitly configure it as off:
>
> seen_info_hash: no
>
> Or have I misinterpreted the issue?

The issue is that the seen plugin doesn't reject the entry as expected.
