JA3 uniqueness in modern version of Chrome (which randomizes ClientHello messages) #88

Open
mikeage opened this issue Feb 14, 2023 · 14 comments

mikeage commented Feb 14, 2023

https://chromestatus.com/feature/5124606246518784 describes a change, designed to prevent protocol ossification, in which Chrome randomizes the order of the TLS extensions in its ClientHello, within the limits the RFC allows.

It was originally scheduled to go into Chrome 110, but was actually merged in Chrome 109.

Other fingerprinting implementations such as https://tlsfingerprint.io have started to sort these extensions to restore some consistency; see https://tlsfingerprint.io/norm_fp .

Does JA3 plan to address this?

cbeuw commented Apr 22, 2023

Can confirm that Chrome no longer produces a single JA3; a spec update is needed. Seven connections from one host, captured within about half a second:

[
    {
        "destination_ip": "140.82.121.4",
        "destination_port": 443,
        "ja3": "771,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,10-5-45-0-17513-13-18-11-23-16-35-27-65281-43-51-21,29-23-24,0",
        "ja3_digest": "2740d6edbeb9712c511e4f0a87d993aa",
        "source_ip": "192.168.0.101",
        "source_port": 65143,
        "timestamp": 1682117797.691859
    },
    {
        "destination_ip": "142.250.203.106",
        "destination_port": 443,
        "ja3": "771,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,0-5-27-23-18-35-10-51-13-45-17513-16-11-43-65281-21,29-23-24,0",
        "ja3_digest": "947089a91c967944d4fb21f0bdf9552e",
        "source_ip": "192.168.0.101",
        "source_port": 65144,
        "timestamp": 1682117797.744229
    },
    {
        "destination_ip": "185.199.109.154",
        "destination_port": 443,
        "ja3": "771,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,5-0-11-65281-35-10-27-16-43-17513-23-51-13-18-45-21,29-23-24,0",
        "ja3_digest": "da44a5b652630311ee5eb56623356a2d",
        "source_ip": "192.168.0.101",
        "source_port": 65145,
        "timestamp": 1682117797.747927
    },
    {
        "destination_ip": "185.199.111.133",
        "destination_port": 443,
        "ja3": "771,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,17513-18-51-11-13-10-27-43-23-0-65281-5-45-16-35-21,29-23-24,0",
        "ja3_digest": "26b1105c45572c61f6c25fdbe724e420",
        "source_ip": "192.168.0.101",
        "source_port": 65146,
        "timestamp": 1682117797.757486
    },
    {
        "destination_ip": "185.199.109.154",
        "destination_port": 443,
        "ja3": "771,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,45-5-18-0-16-35-17513-43-65281-27-11-10-51-23-13-21,29-23-24,0",
        "ja3_digest": "4db0be7bf0688b54185adfdf70b0167f",
        "source_ip": "192.168.0.101",
        "source_port": 65147,
        "timestamp": 1682117797.758512
    },
    {
        "destination_ip": "140.82.112.22",
        "destination_port": 443,
        "ja3": "771,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,5-13-18-0-51-23-43-10-45-11-17513-27-16-65281-35-21,29-23-24,0",
        "ja3_digest": "984b725baea6bfeee32063e5fffaf93c",
        "source_ip": "192.168.0.101",
        "source_port": 65148,
        "timestamp": 1682117798.115487
    },
    {
        "destination_ip": "140.82.121.5",
        "destination_port": 443,
        "ja3": "771,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,10-17513-5-35-0-65281-13-23-45-16-51-18-27-11-43-21,29-23-24,0",
        "ja3_digest": "8ed066e87ae52cd04becca471a0b0892",
        "source_ip": "192.168.0.101",
        "source_port": 65149,
        "timestamp": 1682117798.199789
    }
]
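A quick way to read the capture above: the seven extension fields are all different as strings, yet they contain the same extension IDs. The Python sketch below copies the extension fields from the JSON and checks this. The closing factorial is only an upper bound under an assumption that is not stated in the thread: that the 15 extensions before the trailing `21` are permuted freely.

```python
import math

# Extension fields copied from the seven captured JA3 strings above.
EXTENSION_FIELDS = [
    "10-5-45-0-17513-13-18-11-23-16-35-27-65281-43-51-21",
    "0-5-27-23-18-35-10-51-13-45-17513-16-11-43-65281-21",
    "5-0-11-65281-35-10-27-16-43-17513-23-51-13-18-45-21",
    "17513-18-51-11-13-10-27-43-23-0-65281-5-45-16-35-21",
    "45-5-18-0-16-35-17513-43-65281-27-11-10-51-23-13-21",
    "5-13-18-0-51-23-43-10-45-11-17513-27-16-65281-35-21",
    "10-17513-5-35-0-65281-13-23-45-16-51-18-27-11-43-21",
]


def sorted_ids(field: str) -> tuple[int, ...]:
    """Parse a dash-separated extension field and sort the IDs numerically."""
    return tuple(sorted(int(x) for x in field.split("-")))


distinct_raw = set(EXTENSION_FIELDS)
distinct_sorted = {sorted_ids(f) for f in EXTENSION_FIELDS}
last_ids = {f.split("-")[-1] for f in EXTENSION_FIELDS}

print(len(distinct_raw))     # 7 different orderings
print(len(distinct_sorted))  # 1 set of extension IDs
print(last_ids)              # {'21'}: the last ID is the same in every capture

# Assumption: if the other 15 IDs can appear in any order, this many
# distinct extension fields (and so JA3 digests) are possible.
print(math.factorial(15))    # 1307674368000
```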

ghost commented Apr 22, 2023

I don't understand what the concern is here; can someone clarify? It seems Chrome is randomizing the client hello on purpose. They don't want a server seeing a JA3 and saying "that's Chrome". What is the issue?

cbeuw commented Apr 22, 2023

The goal of JA3 is to fingerprint browsers: one should be able to look at a JA3 and say "that's Chrome". The fact that it can't fingerprint Chrome any more is a deficiency, and one that isn't technically hard to solve.

You may say that one shouldn't be able to fingerprint Chrome, and indeed that may be what the Chrome team wants, but artificially withholding an offensive tool isn't going to improve privacy and security standards in the industry. Software like Metasploit and sqlmap continuously improve their offensive capability while more and more defences are added to their targets. The tools don't stop sharpening just because the target doesn't want to be exploited, because an actual malicious party won't care what the target wants.

ghost commented Apr 22, 2023

> The goal of JA3 is to fingerprint browsers

No, the goal is to fingerprint TLS hellos. Hopefully you can understand that a web browser is not the only type of client. You also have clients like cURL, Go net/http, and many, many others.

> one should be able to look at a JA3 and say "that's Chrome". The fact that it can't fingerprint Chrome any more is a deficiency, and one that isn't technically hard to solve.

That's your opinion. JA3 in my view is merely a specification to fingerprint a TLS hello, client and server. Nothing Chrome has done has changed that. You can still use the specification here and the example implementation to fingerprint hellos. The usefulness of that fingerprint seems to be beyond the scope of the specification.

> You may say that one shouldn't be able to fingerprint Chrome, and indeed that may be what the Chrome team wants, but artificially withholding an offensive tool isn't going to improve privacy and security standards in the industry.

It seems like it already is doing that. If a server is whitelisting Chrome, and Chrome now randomizes the TLS extensions, then the server is forced to remove the whitelist or come up with some other solution. Sounds like a win to me, if only a temporary one.

> Software like Metasploit and sqlmap continuously improve their offensive capability while more and more defences are added to their targets. The tools don't stop sharpening just because the target doesn't want to be exploited, because an actual malicious party won't care what the target wants.

This seems to be way off topic. In my view JA3 is fine as is. If an improvement needs to be made, it's in the SSLVersion field, which is currently underspecified with respect to TLS 1.3, not in anything Google Chrome is doing.

cbeuw commented Apr 22, 2023

I mean yes, you can still calculate a JA3 of a ClientHello from Chrome. But the word "fingerprint" implies that it should be stable within a large enough environmental envelope, such that it can be used to tell one type of client from another. If all you want is a number representing one specific ClientHello message, why not just SHA256 the whole message? JA3 no longer serves any meaningful purpose when one wants to identify extension-randomised TLS clients, so it's not fine.

> some other solution

All you need to do is sort the Extension field in JA3. I already implemented this for my own purposes, so I can just modify the spec and call it JA4 🤷
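For illustration, sorting the Extension field could look like the Python sketch below. This is not cbeuw's implementation (which is not shown in the thread) and not an official spec; `normalize_ja3` and `digest` are hypothetical names, and the two sample strings are shortened, made-up JA3 strings rather than real captures.

```python
import hashlib


def normalize_ja3(ja3: str) -> str:
    """Return the JA3 string with its extension field sorted numerically."""
    version, ciphers, extensions, curves, point_formats = ja3.split(",")
    if extensions:  # the field is empty when the hello carries no extensions
        ids = sorted(int(x) for x in extensions.split("-"))
        extensions = "-".join(str(i) for i in ids)
    return ",".join([version, ciphers, extensions, curves, point_formats])


def digest(ja3: str) -> str:
    """MD5 hex digest of a JA3 string, as JA3 implementations compute it."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


# Two made-up hellos that differ only in extension order.
a = "771,4865-4866,10-5-45-0,29-23-24,0"
b = "771,4865-4866,0-45-5-10,29-23-24,0"

print(digest(a) == digest(b))                                # False
print(normalize_ja3(a))                                      # 771,4865-4866,0-5-10-45,29-23-24,0
print(digest(normalize_ja3(a)) == digest(normalize_ja3(b)))  # True
```

Only the third field is touched here. Cipher suites and curves keep their order, since nothing in this thread suggests Chrome shuffles them.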

ghost commented Apr 22, 2023

> But the word "fingerprint" implies that it should be stable within a large enough environmental envelope, such that it can be used to tell one client from another.

Even with these changes, you can compare two fingerprints and determine whether they are the same or different. If a client historically sent the same hello every time, that is a quirk of history, nothing more. Previously, some crypto operations would take different amounts of time, leading to exploits:

https://wikipedia.org/wiki/Timing_attack

Since then, crypto packages implement certain functions in constant time, which has blocked this type of attack. The fact that TLS client hellos were previously reused was just a weakness in TLS implementations, which allowed attackers (such as yourself) to gain information about a client based on a TLS hello.

> If all you want is a number representing one specific ClientHello message, why not just SHA256 the whole message?

JA3 gives a human readable representation of a TLS hello. If you just hash the bytes, then you lose that.

> JA3 no longer serves any meaningful purpose when one wants to identify extension-randomised TLS clients, so it's not fine.

That's fine by me. I think it was a mistake that it was ever used for that purpose. You can still compare fingerprints, but now, without further action, you can only determine whether the fingerprints are the same or different. You can also still determine which fingerprints are more popular, but most likely Chrome will fall off that list in favor of weaker clients still sending fixed TLS hellos.

> All you need to do is to sort the Extension field in JA3. I already implemented for my own purposes so I can just modify the spec and call it JA4 🤷

OK then, if you wish to implement a new spec, you are welcome to do so.

Author

mikeage commented Apr 22, 2023

I can't speak for anyone else, but as the one who opened this issue, my thoughts are as follows:

Fingerprints are expected to be consistent for a given client instance. They may also be consistent across many different instances of the same client, in which case they can identify the client, but they should not change if the client remains the exact same instance.

Furthermore, any specification should have an expected use case. Saying "the usefulness of that fingerprint seems to be beyond the scope of the specification" is not really a good argument -- one can argue that trying to whitelist browsers, especially versions, is a bad idea (it is -- that's why Client Hints were invented to replace User-Agents), but a spec that doesn't offer a meaningful use case is worthless. Reading the original blog post at https://engineering.salesforce.com/tls-fingerprinting-with-ja3-and-ja3s-247362855967/, it seems that the intention is that a particular client should be consistent, and this is no longer true. (Then again, given that the spec has been released publicly, it's fairly trivial for any form of malware to change its handshake to mimic an existing client (impossible to reverse engineer from the hash, but trivial to capture at a compliant server), so perhaps it's no longer valid once it's been revealed.)

The reason we considered using it is that JA3 is unique among fingerprinting implementations in that Amazon CloudFront offers it natively (see, for example, https://aws.amazon.com/about-aws/whats-new/2022/11/amazon-cloudfront-supports-ja3-fingerprint-headers/). With zero extra compute cost (and negligible time), any request sent through CloudFront can have a JA3 fingerprint added as a header.
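As a rough sketch of what consuming that header at the origin might look like in Python: the header name `CloudFront-Viewer-JA3-Fingerprint` is taken from the AWS announcement linked above, and the code assumes the header carries only the 32-character MD5 digest. `KNOWN_DIGESTS`, `ja3_from_headers`, and `classify` are hypothetical names for illustration; the sample digest is the first one from the capture earlier in this thread.

```python
from typing import Mapping, Optional

# Header name as described in the AWS announcement; compared case-insensitively.
JA3_HEADER = "cloudfront-viewer-ja3-fingerprint"

# Hypothetical lookup table of digests we have seen before.
KNOWN_DIGESTS = {
    "2740d6edbeb9712c511e4f0a87d993aa": "Chrome capture #1 from this thread",
}


def ja3_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the lower-cased JA3 digest from the request headers, if present."""
    for name, value in headers.items():
        if name.lower() == JA3_HEADER:
            return value.strip().lower()
    return None


def classify(headers: Mapping[str, str]) -> str:
    """Label a request by its JA3 digest using the lookup table above."""
    ja3_digest = ja3_from_headers(headers)
    if ja3_digest is None:
        return "no fingerprint"
    return KNOWN_DIGESTS.get(ja3_digest, "unknown")


print(classify({"CloudFront-Viewer-JA3-Fingerprint": "2740d6edbeb9712c511e4f0a87d993aa"}))
print(classify({"CloudFront-Viewer-JA3-Fingerprint": "947089a91c967944d4fb21f0bdf9552e"}))
```

If only the digest reaches the origin, the extension list cannot be re-sorted there, so a randomizing client will mostly land in the "unknown" bucket, as the second call shows.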

ghost commented Apr 22, 2023

> Fingerprints are expected to be consistent for a given client instance. They may also be consistent across many different instances of the same client, in which case they can identify the client, but they should not change if the client remains the exact same instance.

This has never been true. The fact that people relied on this in the past was an assumption based on previous behavior that could have changed at any time. TLS does not specify that the extensions need to be sent, or read, in any particular order. The fact that this assumption was abused by some in this thread is just evidence of a weakness in some TLS implementations, one that is now being addressed by some clients.

> Furthermore, any specification should have an expected use case. Saying "the usefulness of that fingerprint seems to be beyond the scope of the specification" is not really a good argument

JA3 has a use already: a human readable representation of TLS hellos. Further, with the fingerprint specification, one can compare hellos as well. This hasn't changed.

> a spec that doesn't offer a meaningful use case is worthless.

Agreed, but that isn't the situation here. I went over the uses already in previous comments on this page.

> Reading the original blog post at https://engineering.salesforce.com/tls-fingerprinting-with-ja3-and-ja3s-247362855967/, it seems that the intention is that a particular client should be consistent, and this is no longer true.

This wasn't true in 2018:

> When multiple extensions of different types are present, the extensions MAY appear in any order

https://datatracker.ietf.org/doc/html/rfc8446#section-4.2

or in 2008:

> When multiple extensions of different types are present in the ClientHello or ServerHello messages, the extensions MAY appear in any order

https://datatracker.ietf.org/doc/html/rfc5246#section-7.4.1.4

The fact that some clients previously used the same order does not mean that they will continue to do that, that it's required, or that it can be relied on.

> (then again, given that the spec has been released publicly, it's fairly trivial for any form of malware to simply change its handshake to either mimic an existing client (impossible to reverse engineer from the hash, but trivial from a compliant server), so perhaps it's no longer valid once it's been revealed).

Exactly. People seem to forget that if a server can abuse it to spy on clients, a client can too. Any browser can be put behind a MITM proxy to capture its hello, and the same goes for Android. So it's possible to mimic any client hello in order to bypass server restrictions.

Author

mikeage commented Apr 22, 2023

I agree completely that the spec allows it. This is not under debate.

(edit: when I wrote "it seems that the intention is that a particular client should be consistent, and this is no longer true", I was referring to the intention of JA3, not the tls hello)

According to your view of the purpose of JA3, what use is it at all? Why would I want to compare TLS handshakes, and assuming I want to, why would I want a human readable version? The blog mentions identifying malware, which is only meaningful if the malware client has a unique and consistent fingerprint. It does in their examples, but only because the malware clients don't randomize the order. I believe we both agree that a modern hacker would bypass this. Who do you envision using JA3 and for what practical purpose?

(FWIW, my goal was to see if we can find people who use, say, python requests to mimic a request that we expect to come only from a real browser. It's part of a set of behavioral patterns that attempt to identify impersonation and replays. Timing analysis is another part of it, along with quite a few other parameters.)

ghost commented Apr 22, 2023

> (edit: when I wrote "it seems that the intention is that a particular client should be consistent, and this is no longer true", I was referring to the intention of JA3, not the tls hello)

Those who wrote the JA3 spec might have assumed that clients would send extensions in the same order; however, this has never been required, at least since 2008 when the TLS 1.2 spec was released.

> According to your view of the purpose of JA3, what use is it at all? Why would I want to compare TLS handshakes, and assuming I want to, why would I want a human readable version?

Why would you want a human readable representation of any binary format? I suppose Google should also toss their human readable ProtoBuf implementation:

https://google.golang.org/protobuf/encoding/prototext

You would want to compare TLS hellos for the same reason as always: to glean information about the client sending the hello. The fact that some clients are now hardened is just a consequence of time. As servers have started spying on users via TLS, the natural response is for clients to take steps to anonymize themselves against this type of spying.

> The blog mentions identifying malware, which is only meaningful if the malware client has a unique and consistent fingerprint.

What does malware have to do with Google Chrome?

> It does in their examples, but only because the malware clients don't randomize the order. I believe we both agree that a modern hacker would bypass this. Who do you envision using JA3 and for what practical purpose?

Anyone who desires a human readable representation of a TLS client hello, or who wishes to compare fingerprints. Users will need to keep in mind, as has always been the case, that clients can fake the hello or harden it using randomization. This has been possible for at least 15 years.

> (FWIW, my goal was to see if we can find people who use, say, python requests to mimic a request that we expect to come only from a real browser. It's part of a set of behavioral patterns that attempt to identify impersonation and replays. Timing analysis is another part of it, along with quite a few other parameters.)

Yes, people already do this. Not sure about Python, but a Go module is already available for this purpose:

https://github.com/refraction-networking/utls

Author

mikeage commented Apr 22, 2023

I'm familiar with utls, but mostly because I learned about it from the tlsfingerprint.io site linked in the original issue! They chose to offer fingerprints with the hello sorted, in order to get back to getting meaningful data from Chrome. A reference to their discussions with the Chrome team is at https://groups.google.com/a/chromium.org/g/blink-dev/c/zdmNs2rTyVI/m/OlV6ILBOBwAJ .

In any case, I disagree with your suggestion that a fingerprint of binary data is useful for comparison; besides standardizing the length (ok, it's useful in that sense), it offers nothing more readable than the original. This is unlike protobuf's textual implementation, or assembly vs machine language, or WASM's text vs binary representation, which are helpful since they're readable and meaningful.

That said, I doubt we're likely to reach agreement here. I think each of our positions stands on its own merits, and on the assumption that you're not involved with JA3, I don't really see any value in us continuing this debate.

ghost commented Apr 22, 2023

> They chose to offer fingerprints with the hello sorted, in order to get back to getting meaningful data from Chrome.

Yes, I can see why you might want this. However, it seems you are unfamiliar with how JA3 is structured, so let me add some detail here. When I am talking about a text representation, I am talking about the JA3 string itself, so something like this:

769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0

Implementations then take this textual data and MD5 it to return a fingerprint:

ada70206e40642a3e4461f35503241d5

So any previous reference to textual representation on this page is about the JA3 string proper, not a fingerprint. As you said, a fingerprint is no more useful than the binary format in regards to understanding the underlying structure.
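To make the two layers concrete, here is a minimal Python sketch of how a JA3 string like the one quoted above is assembled from already-parsed ClientHello fields and then hashed. It is not the reference implementation; `build_ja3`, `join_ids`, and `is_grease` are hypothetical names. The GREASE filter reflects the JA3 convention of ignoring the reserved GREASE values (RFC 8701).

```python
import hashlib


def is_grease(value: int) -> bool:
    """True for the reserved GREASE values 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def join_ids(values) -> str:
    """Dash-join decimal IDs in their original order, skipping GREASE values."""
    return "-".join(str(v) for v in values if not is_grease(v))


def build_ja3(version, ciphers, extensions, curves, point_formats) -> str:
    """Assemble the five comma-separated JA3 fields."""
    return ",".join([
        str(version),
        join_ids(ciphers),
        join_ids(extensions),
        join_ids(curves),
        join_ids(point_formats),
    ])


ja3 = build_ja3(
    769,
    [47, 53, 5, 10, 49161, 49162, 49171, 49172, 50, 56, 19, 4],
    [0, 10, 11],
    [23, 24, 25],
    [0],
)
print(ja3)  # the human-readable string
print(hashlib.md5(ja3.encode("ascii")).hexdigest())  # the 32-character fingerprint
```

The string keeps the on-the-wire order of every list, which is why a client that shuffles its extensions changes the string and therefore the hash.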

> In any case, I disagree with your suggestion that a fingerprint of binary data is useful for comparison; besides for standardizing the length (ok, it's useful in that sense), it offers nothing more readable than the original.

See above.

> This is unlike protobuf's textual implementation, or assembly vs machine language, or WASM's text vs binary representation, which are helpful since they're readable and meaningful.

See above.

> That said, I doubt we're likely to reach agreement here. I think each of our positions stands on its own merits, and on the assumption that you're not involved with JA3, I don't really see any value in us continuing this debate.

Yes, it's important that other sides be presented here; otherwise users can railroad the issue without lending a critical eye to the potential drawbacks of a change. With all due respect, I think I have a place at this table, seeing as I have extensively studied this issue and have successfully captured and mimicked client requests, including those of browsers and Android itself. I have also used these methods to bypass TLS fingerprinting attempts by Google servers [1] and Cloudflare.

Another point to consider. If you want to change the fingerprint, you would need to change the underlying JA3 as well. If you change the underlying JA3, then you lose the original order of the TLS hello. This could break roundtrip encoding in some cases, if a server is sensitive to the TLS extension order.

  1. https://github.com/4cq2/googleplay/blob/v1.1.2/auth.go#L137-L165
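One way to address the round-trip concern raised above is to keep the raw JA3 string and derive the sorted form from it, rather than overwriting it. The Python sketch below shows that idea under stated assumptions; `HelloRecord` is a hypothetical type, not part of any JA3 implementation, and the sample string is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelloRecord:
    """Stores the JA3 string exactly as observed; the sorted form is derived."""
    raw_ja3: str

    @property
    def normalized_ja3(self) -> str:
        fields = self.raw_ja3.split(",")
        if fields[2]:  # third field holds the extension IDs
            fields[2] = "-".join(sorted(fields[2].split("-"), key=int))
        return ",".join(fields)


record = HelloRecord("771,4865,10-5-0,29,0")
print(record.raw_ja3)         # 771,4865,10-5-0,29,0  (wire order preserved)
print(record.normalized_ja3)  # 771,4865,0-5-10,29,0  (order-independent)
```

The normalized form can be used for grouping clients, while the raw form stays available for replaying a hello against a server that is sensitive to extension order.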

Author

mikeage commented Apr 23, 2023

Regardless of anything else, that's a nice utility (googleplay). Thanks :-)

NgoHuy commented May 31, 2023

With ECH, we cannot calculate JA3.
