Issue-6642 Add http.send request attribute to ignore headers for caching key #6675

Open
wants to merge 4 commits into
base: main
Changes from 2 commits
41 changes: 39 additions & 2 deletions topdown/http.go
@@ -68,6 +68,7 @@ var allowedKeyNames = [...]string{
"raise_error",
"caching_mode",
"max_retry_attempts",
"cache_ignored_headers",
}

// ref: https://www.rfc-editor.org/rfc/rfc7231#section-6.1
@@ -168,7 +169,11 @@ func getHTTPResponse(bctx BuiltinContext, req ast.Object) (*ast.Term, error) {

bctx.Metrics.Timer(httpSendLatencyMetricKey).Start()

- reqExecutor, err := newHTTPRequestExecutor(bctx, req)
+ key, err := getKeyFromRequest(req)
+ if err != nil {
+ return nil, err
+ }
+ reqExecutor, err := newHTTPRequestExecutor(bctx, key)
if err != nil {
return nil, err
}
@@ -198,6 +203,38 @@ func getHTTPResponse(bctx BuiltinContext, req ast.Object) (*ast.Term, error) {
return ast.NewTerm(resp), nil
}

// getKeyFromRequest returns a key to be used for caching HTTP responses
// deletes headers from request object mentioned in cache_ignored_headers
func getKeyFromRequest(req ast.Object) (ast.Object, error) {
	var cacheIgnoredHeaders []string
	var allHeaders map[string]interface{}
	cacheIgnoredHeadersTerm := req.Get(ast.StringTerm("cache_ignored_headers"))
Member

Nit: we can exit early here:

if cacheIgnoredHeadersTerm == nil {
    return nil, nil
}

	allHeadersTerm := req.Get(ast.StringTerm("headers"))
	if cacheIgnoredHeadersTerm != nil && allHeadersTerm != nil {
		err := ast.As(cacheIgnoredHeadersTerm.Value, &cacheIgnoredHeaders)
		if err != nil {
			return nil, err
		}
		err = ast.As(allHeadersTerm.Value, &allHeaders)
		if err != nil {
			return nil, err
		}
		for _, header := range cacheIgnoredHeaders {
			delete(allHeaders, header)
		}
		val, err := ast.InterfaceToValue(allHeaders)
		if err != nil {
			return nil, err
		}
		allHeadersTerm.Value = val
		req.Insert(ast.StringTerm("headers"), allHeadersTerm)
	}
	if cacheIgnoredHeadersTerm != nil {
		req.Insert(ast.StringTerm("cache_ignored_headers"), ast.NullTerm())
	}
	return req, nil
}
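
For readers following the key derivation above, here is a minimal sketch of the same idea written so that the request being sent is never modified. It uses plain Go maps rather than OPA's ast.Object, and cacheKey is a hypothetical name, not part of this PR. It also folds in the early exit suggested in review, returning the request itself rather than nil, on the assumption that the caller passes the result on to newHTTPRequestExecutor as the hunk above does.

```go
package main

import "fmt"

// cacheKey is a hypothetical, map-based stand-in for getKeyFromRequest.
// It returns a copy of req with the ignored headers removed and leaves
// req itself untouched.
func cacheKey(req map[string]any) map[string]any {
	ignored, ok := req["cache_ignored_headers"].([]string)
	if !ok {
		// Early exit: no ignore list, so the request is its own key.
		return req
	}

	// Shallow-copy the request so the original stays as the caller built it.
	key := make(map[string]any, len(req))
	for k, v := range req {
		key[k] = v
	}
	// Mirror the diff: the ignore list itself is nulled out in the key.
	key["cache_ignored_headers"] = nil

	headers, ok := req["headers"].(map[string]string)
	if !ok {
		return key
	}

	// Copy the headers, then drop the ignored ones from the copy only.
	kept := make(map[string]string, len(headers))
	for name, value := range headers {
		kept[name] = value
	}
	for _, name := range ignored {
		delete(kept, name)
	}
	key["headers"] = kept
	return key
}

func main() {
	req := map[string]any{
		"method":                "get",
		"headers":               map[string]string{"h1": "v1", "h2": "v2"},
		"cache_ignored_headers": []string{"h2"},
	}
	key := cacheKey(req)
	fmt.Println(key["headers"]) // headers that take part in the cache key
	fmt.Println(req["headers"]) // headers still present on the outgoing request
}
```

Keeping the key separate from the request means h2 can still be sent to the server while being left out of the cache lookup.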

func init() {
createAllowedKeys()
createCacheableHTTPStatusCodes()
@@ -482,7 +519,7 @@ func createHTTPRequest(bctx BuiltinContext, obj ast.Object) (*http.Request, *htt
case "cache", "caching_mode",
"force_cache", "force_cache_duration_seconds",
"force_json_decode", "force_yaml_decode",
- "raise_error", "max_retry_attempts": // no-op
+ "raise_error", "max_retry_attempts", "cache_ignored_headers": // no-op
default:
return nil, nil, fmt.Errorf("invalid parameter %q", key)
}
40 changes: 40 additions & 0 deletions topdown/http_test.go
@@ -1012,6 +1012,46 @@ func TestHTTPSendCaching(t *testing.T) {
response: `{"x": 1}`,
expectedReqCount: 3,
},
{
note: "http.send GET different headers but still cached because ignored",
ruleTemplate: `p = x {
r1 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v2"}, "cache_ignored_headers": ["h2"]})
r2 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v3"}, "cache_ignored_headers": ["h2"]}) # cached
x = r1.body
}`,
response: `{"x": 1}`,
expectedReqCount: 1,
},
{
note: "http.send GET cache miss different headers (force_cache enabled)",
Member

I'm not sure which scenario in these changes this test case is meant to exercise.

Contributor Author

@rudrakhp, May 26, 2024

This clarifies how the cache behaves when cache_ignored_headers is not set. No existing test case covered a cache miss caused by differing header values.
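
To make the hit and miss counts in these test cases concrete, here is a rough simulation of how the key drives the number of requests that reach the server. fakeCache, send, and countRequests are hypothetical names, and the cache is only a set of JSON-encoded header maps, so this is an illustration of the expected counts rather than how OPA's cache is implemented.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fakeCache is a hypothetical stand-in for the http.send cache. It only
// counts how many requests would have gone to the server.
type fakeCache struct {
	seen     map[string]bool
	upstream int
}

// send derives a key from the headers minus the ignored ones and
// "calls the server" only when that key has not been seen before.
func (c *fakeCache) send(headers map[string]string, ignored []string) {
	kept := map[string]string{}
	for name, value := range headers {
		kept[name] = value
	}
	for _, name := range ignored {
		delete(kept, name)
	}
	// json.Marshal writes map keys in sorted order, so equal maps
	// always encode to the same string.
	raw, err := json.Marshal(kept)
	if err != nil {
		panic(err)
	}
	key := string(raw)
	if !c.seen[key] {
		c.seen[key] = true
		c.upstream++
	}
}

// countRequests replays the two calls from the test cases: same h1,
// different h2, with the given ignore list applied to both.
func countRequests(ignored []string) int {
	c := &fakeCache{seen: map[string]bool{}}
	c.send(map[string]string{"h1": "v1", "h2": "v2"}, ignored)
	c.send(map[string]string{"h1": "v1", "h2": "v3"}, ignored)
	return c.upstream
}

func main() {
	fmt.Println(countRequests(nil))            // 2: h2 differs, so the second call misses
	fmt.Println(countRequests([]string{"h2"})) // 1: h2 is ignored, so the second call hits
}
```

The two printed values line up with expectedReqCount: 2 in the case without cache_ignored_headers and expectedReqCount: 1 in the cases that ignore h2.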

ruleTemplate: `p = x {
r1 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v2"}, "force_cache": true, "force_cache_duration_seconds": 300})
r2 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v3"}, "force_cache": true, "force_cache_duration_seconds": 300})
x = r1.body
}`,
response: `{"x": 1}`,
expectedReqCount: 2,
},
{
note: "http.send GET different headers but still cached because ignored (force_cache enabled)",
ruleTemplate: `p = x {
r1 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v2"}, "force_cache": true, "force_cache_duration_seconds": 300, "cache_ignored_headers": ["h2"]})
r2 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v3"}, "force_cache": true, "force_cache_duration_seconds": 300, "cache_ignored_headers": ["h2"]}) # cached
x = r1.body
}`,
response: `{"x": 1}`,
expectedReqCount: 1,
},
{
note: "http.send GET different cache_ignored_headers but still cached (force_cache enabled)",
ruleTemplate: `p = x {
r1 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v2"}, "force_cache": true, "force_cache_duration_seconds": 300, "cache_ignored_headers": ["h2"]})
r2 = http.send({"method": "get", "url": "%URL%", "force_json_decode": true, "headers": {"h1": "v1", "h2": "v2"}, "force_cache": true, "force_cache_duration_seconds": 300, "cache_ignored_headers": ["h2", "h3"]}) # cached
Member

For the test, we could actually include an h3 header in the headers object.

Member

A test for the scenario where the values of cache_ignored_headers and headers differ and we get a cache miss would also be helpful.

Member

One more case I can think of:

R1: {"headers": {"h1": "v1"}}
R2: {"headers": {"h1": "v1", "h2": "v2"}, "cache_ignored_headers": ["h2"]}

So here R1 and R2 are equivalent, correct?

Member

Another one:

R1: {"headers": {"h1": "v1"}, "cache_ignored_headers": []}
R2: {"headers": {"h1": "v1", "h2": "v2"}, "cache_ignored_headers": ["h2"]}

So here R1 and R2 are equivalent, correct?

Contributor Author

Done
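
The two R1/R2 pairs raised in the review comments above can be checked with a small model. This is a sketch under stated assumptions: request and keyOf are hypothetical, the key is a JSON string rather than an ast.Object, and the nullMarker flag models what the diff appears to do (a request that carries cache_ignored_headers keeps a null entry for it in the key). Whether the merged implementation treats the pairs this way is not confirmed by this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// request is a hypothetical, trimmed-down http.send request.
// Ignored is nil when cache_ignored_headers is absent.
type request struct {
	Headers map[string]string
	Ignored []string
}

// keyOf builds a comparable cache key. With nullMarker set, a request that
// carries cache_ignored_headers keeps a null entry for it in the key;
// without it, the attribute is left out of the key entirely.
func keyOf(r request, nullMarker bool) string {
	kept := map[string]string{}
	for name, value := range r.Headers {
		kept[name] = value
	}
	for _, name := range r.Ignored {
		delete(kept, name)
	}
	key := map[string]any{"headers": kept}
	if nullMarker && r.Ignored != nil {
		key["cache_ignored_headers"] = nil
	}
	raw, err := json.Marshal(key)
	if err != nil {
		panic(err)
	}
	return string(raw)
}

func main() {
	// R1 from the first pair: no cache_ignored_headers at all.
	absent := request{Headers: map[string]string{"h1": "v1"}}
	// R1 from the second pair: an empty cache_ignored_headers list.
	empty := request{Headers: map[string]string{"h1": "v1"}, Ignored: []string{}}
	// R2 in both pairs.
	r2 := request{Headers: map[string]string{"h1": "v1", "h2": "v2"}, Ignored: []string{"h2"}}

	for _, marker := range []bool{true, false} {
		fmt.Println(marker,
			keyOf(absent, marker) == keyOf(r2, marker),
			keyOf(empty, marker) == keyOf(r2, marker))
	}
}
```

In this model the second pair is equivalent either way, while the first pair is equivalent only when the attribute is left out of the key, which is the distinction the added tests would need to pin down.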

x = r1.body
}`,
response: `{"x": 1}`,
expectedReqCount: 1,
},
{
note: "http.send POST cache miss different body",
ruleTemplate: `p = x {