Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error: 14 UNAVAILABLE: 403:Forbidden #1814

Open
iAMkVIN-S opened this issue Jan 19, 2023 · 30 comments
Open

Error: 14 UNAVAILABLE: 403:Forbidden #1814

iAMkVIN-S opened this issue Jan 19, 2023 · 30 comments
Assignees
Labels
api: firestore Issues related to the googleapis/nodejs-firestore API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@iAMkVIN-S
Copy link

Describe your environment

  • Operating System version: Ubuntu 20.04.1 LTS
  • Firebase SDK version: 11.4.1
  • Firebase Product: Firebase Admin
  • Node.js version: 18.13.0
  • NPM version: 8.19.3
    (Used as a subdependency)
  • @google-cloud/firestore version: 6.4.2

Describe the problem

Steps to reproduce:

Our NodeJS backend, running ExpressJS and using Firebase Admin SDK has been running smoothly for 2 years. All of a sudden, on the morning of the 17th of Jan, our service was interupted by a seemignly routine Firebase call made by our baseModel. This causes all requests to fail until we reboot the backend. Since then, the error has happened randomly every 2-6+ hours with thousands of OK requests in between. The error seems to stem from the Firebase Admin SDK package/its depencies.

Relevant Code:

Error Log:

Jan 17 10:02:08 Error: 14 UNAVAILABLE: 403:Forbidden
Jan 17 10:02:08 at Object.callErrorFromStatus (/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call.js:31:19)
Jan 17 10:02:08 at Object.onReceiveStatus (/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client.js:352:49)
Jan 17 10:02:08 at Object.onReceiveStatus (/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:328:181)
Jan 17 10:02:08 at /app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call-stream.js:188:78
Jan 17 10:02:08 at processTicksAndRejections (internal/process/task_queues.js:82:9)
Jan 17 10:02:08 for call at
Jan 17 10:02:08 at ServiceClientImpl.makeServerStreamRequest (/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client.js:336:30)
Jan 17 10:02:08 at ServiceClientImpl.<anonymous> (/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/make-client.js:105:19)
Jan 17 10:02:08 at /app/node_modules/@google-cloud/firestore/build/src/v1/firestore_client.js:205:29
Jan 17 10:02:08 at /app/node_modules/google-gax/build/src/streamingCalls/streamingApiCaller.js:38:28
Jan 17 10:02:08 at /app/node_modules/google-gax/build/src/normalCalls/timeout.js:44:16
Jan 17 10:02:08 at Object.request (/app/node_modules/google-gax/build/src/streamingCalls/streaming.js:126:40)
Jan 17 10:02:08 at Timeout.makeRequest [as _onTimeout] (/app/node_modules/retry-request/index.js:139:28)
Jan 17 10:02:08 at listOnTimeout (internal/timers.js:531:17)
Jan 17 10:02:08 at processTimers (internal/timers.js:475:7)
Jan 17 10:02:08 Caused by: Error
Jan 17 10:02:08 at Firestore.getAll (/app/node_modules/@google-cloud/firestore/build/src/index.js:902:23)
Jan 17 10:02:08 at DocumentReference.get (/app/node_modules/@google-cloud/firestore/build/src/reference.js:211:32)
Jan 17 10:02:08 at Function.<anonymous> (/app/dist/models/baseModel.js:278:62)
Jan 17 10:02:08 at Generator.next (<anonymous>)
Jan 17 10:02:08 at /app/dist/models/baseModel.js:31:71
Jan 17 10:02:08 at new Promise (<anonymous>)
Jan 17 10:02:08 at __awaiter (/app/dist/models/baseModel.js:27:12)
Jan 17 10:02:08 at Function.findById (/app/dist/models/baseModel.js:274:16)
Jan 17 10:02:08 at /app/dist/routes/routeUtils.js:62:42
Jan 17 10:02:08 at Generator.next (<anonymous>) {
Jan 17 10:02:08 code: 14,
Jan 17 10:02:08 details: '403:Forbidden',
Jan 17 10:02:08 metadata: Metadata {
Jan 17 10:02:08 internalRepr: Map {
Jan 17 10:02:08 'content-type' => [Array],
Jan 17 10:02:08 'content-length' => [Array],
Jan 17 10:02:08 'date' => [Array],
Jan 17 10:02:08 'alt-svc' => [Array]
Jan 17 10:02:08 },
Jan 17 10:02:08 options: {}
Jan 17 10:02:08 }
Jan 17 10:02:08 }

Trickled down baseModel line from our code that does the Firebase fetch:

   static findById(id, showDeleted = false) {
        return __awaiter(this, void 0, void 0, function* () {
            try {
                if (!id)
                    throw new Error(`Invalid id provided for model ${this.name}`);
                const result = yield this.collection.doc(id).get();
                if (!result.exists)
                    return null;
                if (!showDeleted && result.data().isDeleted)
                    return null;
                return new this(Object.assign({}, result.data()));
            }
            catch (e) {
                console.error(e);
            }
        });
    }

Our package.json:

"dependencies": {
        "@sendgrid/mail": "^7.7.0",
        "axios": "^1.2.3",
        "cors": "^2.8.5",
        "crypto-js": "^4.1.1",
        "express": "^4.17.1",
        "firebase-admin": "^11.4.1",
        "lodash": "^4.17.21",
        "moment": "^2.29.4",
        "morgan": "^1.10.0",
        "nanoid": "^3.3.4",
        "stripe": "^11.6.0",
        "yup": "^0.32.11"
    },
    "devDependencies": {
        "@types/cors": "^2.8.13",
        "@types/crypto-js": "^4.1.1",
        "@types/express": "^4.17.15",
        "@types/lodash": "^4.14.191",
        "@types/morgan": "^1.9.4",
        "@types/node": "^18.11.18",
        "@types/yup": "^0.32.0",
        "@typescript-eslint/eslint-plugin": "^4.31.2",
        "@typescript-eslint/parser": "^4.31.2",
        "eslint": "^7.32.0",
        "nodemon": "^2.0.4",
        "ts-node": "^10.9.1",
        "typescript": "^4.9.4"
    }
@iAMkVIN-S iAMkVIN-S added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Jan 19, 2023
@product-auto-label product-auto-label bot added the api: firestore Issues related to the googleapis/nodejs-firestore API. label Jan 19, 2023
@MarkDuckworth MarkDuckworth self-assigned this Jan 19, 2023
@MarkDuckworth
Copy link
Contributor

Thanks for reporting. Looks like this issue may have also been reported through a support request. Can you confirm if that's true so we can coordinate? Thanks

@iAMkVIN-S
Copy link
Author

Thank you for the prompt reply!

@MarkDuckworth Correct.

I am sorry if that was the wrong thing to do.
We've been getting desperate as the error has now been happpening almost hourly. Service is therefore interrupted quite regularly and has escalated on our side to the highest priority.

Cheers

@MarkDuckworth
Copy link
Contributor

I just wanted to confirm it is the same report, so that we can coordinate internally. Thank you.

I spoke with an engineer from the backend about the error code, but we're not yet sure what's causing it.

Can you double check the project ID you provided on the other support channel and send an update there if needed? Also, can you help us understand how often this call to Firestore is made successfully?

@iAMkVIN-S
Copy link
Author

I have sent the project ID along through the support email thread.

This call to firestore is successfully made thousands of times inbetween getting the error.

I've also looked at the USAGE and it looks like we've average 21 Million+ of these READ requests per month and have been running essentially this configuration for about 2 years without ever seeing this error prior.

image

First encounter of this error was on Jan 17th at 5:02am [EST] while running a version of our backend that had not received a deploy in 10 days. (meaning nothing changed) On the 18th of Jan the error happened 3-4 times. Today, on the 19th of Jan, we've gotten the error a little over once per hour.

@iAMkVIN-S
Copy link
Author

iAMkVIN-S commented Jan 20, 2023

Important update:

This is now happening on another backend of ours that is running firebase-admin

The last deploy of this backend was from 6 weeks ago, with a non significant change. This has also been running for years without an issue prior. An important difference here, is rebooting doesn't seem to clear the error for a little bit, we've been unable to bring it back up and if this happens to our other backend, we're going to be in trouble!

Jan 20 10:08:51 (node:1) UnhandledPromiseRejectionWarning: Error: 14 UNAVAILABLE: 403:Forbidden
Jan 20 10:08:51 at Object.callErrorFromStatus (/usr/src/app/node_modules/@grpc/grpc-js/build/src/call.js:30:26)
Jan 20 10:08:51 at Object.onReceiveStatus (/usr/src/app/node_modules/@grpc/grpc-js/build/src/client.js:328:49)
Jan 20 10:08:51 at Object.onReceiveStatus (/usr/src/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:304:181)
Jan 20 10:08:51 at Http2CallStream.outputStatus (/usr/src/app/node_modules/@grpc/grpc-js/build/src/call-stream.js:116:74)
Jan 20 10:08:51 at Http2CallStream.maybeOutputStatus (/usr/src/app/node_modules/@grpc/grpc-js/build/src/call-stream.js:155:22)
Jan 20 10:08:51 at Http2CallStream.endCall (/usr/src/app/node_modules/@grpc/grpc-js/build/src/call-stream.js:141:18)
Jan 20 10:08:51 at Http2CallStream.handleTrailers (/usr/src/app/node_modules/@grpc/grpc-js/build/src/call-stream.js:266:14)
Jan 20 10:08:51 at ClientHttp2Stream.<anonymous> (/usr/src/app/node_modules/@grpc/grpc-js/build/src/call-stream.js:315:26)
Jan 20 10:08:51 at ClientHttp2Stream.emit (events.js:200:13)
Jan 20 10:08:51 at emit (internal/http2/core.js:247:8)
Jan 20 10:08:51 Caused by: Error
Jan 20 10:08:51 at Query._get (/usr/src/app/node_modules/@google-cloud/firestore/build/src/reference.js:1485:23)
Jan 20 10:08:51 at Query.get (/usr/src/app/node_modules/@google-cloud/firestore/build/src/reference.js:1474:21)
Jan 20 10:08:51 at Function.find (/usr/src/app/dist/models/BaseModel.js:362:48)
Jan 20 10:08:51 at /usr/src/app/dist/app.js:43:27
Jan 20 10:08:51 at Layer.handle [as handle_request] (/usr/src/app/node_modules/express/lib/router/layer.js:95:5)
Jan 20 10:08:51 at next (/usr/src/app/node_modules/express/lib/router/route.js:137:13)
Jan 20 10:08:51 at Route.dispatch (/usr/src/app/node_modules/express/lib/router/route.js:112:3)
Jan 20 10:08:51 at Layer.handle [as handle_request] (/usr/src/app/node_modules/express/lib/router/layer.js:95:5)
Jan 20 10:08:51 at /usr/src/app/node_modules/express/lib/router/index.js:281:22
Jan 20 10:08:51 at Function.process_params (/usr/src/app/node_modules/express/lib/router/index.js:335:12)
Jan 20 10:08:51 (node:1) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 6058)

@MarkDuckworth
Copy link
Contributor

Are you using App Check, or is that enabled for your Firebase app?

Also, can you tell me more about what happens after this first error? Are subsequent requests failing with the same error?

@iAMkVIN-S
Copy link
Author

iAMkVIN-S commented Jan 20, 2023

Thanks for the reply @MarkDuckworth .

  1. We are not using App Check on any of the two projects that use firebase-admin that are currently crashing.

  2. As soon as we get the error, any following requests will receive this error (and therefore, the backend logs become a spam wall of this error).

We have to restart the backend, which sometimes the error will happen at the first request, sometimes we'll be fine for 45 minutes.

(Maybe there is a timeout or cooldown on your side that isn't reached when we reboot too quickly? or maybe it's random)

@MarkDuckworth
Copy link
Contributor

@iAMkVIN-S, I'm recommending that you try to resolve this through the support ticket you have. That will likely be preferred moving forward because it will allow you to share more information privately. There seem to be more eyes on that conversation as well. For us, it reduces duplicate work. If you disagree, please feel free to re-open this conversation.

@iAMkVIN-S
Copy link
Author

@MarkDuckworth Thank you so much.

I've responded via the support request and will continue to do so there since the fix is unlikely to be done by our side.

I hope this thread is useful for others experiencing the same error to know that the team at Firebase/Google are taking the issue seriously.

Cheers

@ebeloin
Copy link

ebeloin commented Jan 25, 2023

Describe your environment

  • Operating System version: Ubuntu 20.04.1 LTS
  • Firebase SDK version: 11.4.1
  • Firebase Product: Firebase Admin
  • Node.js version: 18.13.0
  • NPM version: 8.19.3
    (Used as a subdependency)
  • @google-cloud/firestore version: 6.4.2

Describe the problem

Steps to reproduce:

Our NodeJS backend, running ExpressJS and using Firebase Admin SDK has been running smoothly for 2 years. All of a sudden, on the morning of the 17th of Jan, our service was interupted by a seemignly routine Firebase call made by our baseModel. This causes all requests to fail until we reboot the backend. Since then, the error has happened randomly every 2-6+ hours with thousands of OK requests in between. The error seems to stem from the Firebase Admin SDK package/its depencies.

Relevant Code:

Error Log:

Jan 17 10:02:08 Error: 14 UNAVAILABLE: 403:Forbidden
Jan 17 10:02:08 at Object.callErrorFromStatus (/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call.js:31:19)
Jan 17 10:02:08 at Object.onReceiveStatus (/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client.js:352:49)
Jan 17 10:02:08 at Object.onReceiveStatus (/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:328:181)
Jan 17 10:02:08 at /app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call-stream.js:188:78
Jan 17 10:02:08 at processTicksAndRejections (internal/process/task_queues.js:82:9)
Jan 17 10:02:08 for call at
Jan 17 10:02:08 at ServiceClientImpl.makeServerStreamRequest (/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client.js:336:30)
Jan 17 10:02:08 at ServiceClientImpl.<anonymous> (/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/make-client.js:105:19)
Jan 17 10:02:08 at /app/node_modules/@google-cloud/firestore/build/src/v1/firestore_client.js:205:29
Jan 17 10:02:08 at /app/node_modules/google-gax/build/src/streamingCalls/streamingApiCaller.js:38:28
Jan 17 10:02:08 at /app/node_modules/google-gax/build/src/normalCalls/timeout.js:44:16
Jan 17 10:02:08 at Object.request (/app/node_modules/google-gax/build/src/streamingCalls/streaming.js:126:40)
Jan 17 10:02:08 at Timeout.makeRequest [as _onTimeout] (/app/node_modules/retry-request/index.js:139:28)
Jan 17 10:02:08 at listOnTimeout (internal/timers.js:531:17)
Jan 17 10:02:08 at processTimers (internal/timers.js:475:7)
Jan 17 10:02:08 Caused by: Error
Jan 17 10:02:08 at Firestore.getAll (/app/node_modules/@google-cloud/firestore/build/src/index.js:902:23)
Jan 17 10:02:08 at DocumentReference.get (/app/node_modules/@google-cloud/firestore/build/src/reference.js:211:32)
Jan 17 10:02:08 at Function.<anonymous> (/app/dist/models/baseModel.js:278:62)
Jan 17 10:02:08 at Generator.next (<anonymous>)
Jan 17 10:02:08 at /app/dist/models/baseModel.js:31:71
Jan 17 10:02:08 at new Promise (<anonymous>)
Jan 17 10:02:08 at __awaiter (/app/dist/models/baseModel.js:27:12)
Jan 17 10:02:08 at Function.findById (/app/dist/models/baseModel.js:274:16)
Jan 17 10:02:08 at /app/dist/routes/routeUtils.js:62:42
Jan 17 10:02:08 at Generator.next (<anonymous>) {
Jan 17 10:02:08 code: 14,
Jan 17 10:02:08 details: '403:Forbidden',
Jan 17 10:02:08 metadata: Metadata {
Jan 17 10:02:08 internalRepr: Map {
Jan 17 10:02:08 'content-type' => [Array],
Jan 17 10:02:08 'content-length' => [Array],
Jan 17 10:02:08 'date' => [Array],
Jan 17 10:02:08 'alt-svc' => [Array]
Jan 17 10:02:08 },
Jan 17 10:02:08 options: {}
Jan 17 10:02:08 }
Jan 17 10:02:08 }

Trickled down baseModel line from our code that does the Firebase fetch:

   static findById(id, showDeleted = false) {
        return __awaiter(this, void 0, void 0, function* () {
            try {
                if (!id)
                    throw new Error(`Invalid id provided for model ${this.name}`);
                const result = yield this.collection.doc(id).get();
                if (!result.exists)
                    return null;
                if (!showDeleted && result.data().isDeleted)
                    return null;
                return new this(Object.assign({}, result.data()));
            }
            catch (e) {
                console.error(e);
            }
        });
    }

Our package.json:

"dependencies": {
        "@sendgrid/mail": "^7.7.0",
        "axios": "^1.2.3",
        "cors": "^2.8.5",
        "crypto-js": "^4.1.1",
        "express": "^4.17.1",
        "firebase-admin": "^11.4.1",
        "lodash": "^4.17.21",
        "moment": "^2.29.4",
        "morgan": "^1.10.0",
        "nanoid": "^3.3.4",
        "stripe": "^11.6.0",
        "yup": "^0.32.11"
    },
    "devDependencies": {
        "@types/cors": "^2.8.13",
        "@types/crypto-js": "^4.1.1",
        "@types/express": "^4.17.15",
        "@types/lodash": "^4.14.191",
        "@types/morgan": "^1.9.4",
        "@types/node": "^18.11.18",
        "@types/yup": "^0.32.0",
        "@typescript-eslint/eslint-plugin": "^4.31.2",
        "@typescript-eslint/parser": "^4.31.2",
        "eslint": "^7.32.0",
        "nodemon": "^2.0.4",
        "ts-node": "^10.9.1",
        "typescript": "^4.9.4"
    }

I have the same issue. Please help us fix this.

@MarkDuckworth
Copy link
Contributor

Thanks for brining that to our attention @manoslive. I'll ping the internal thread and let them know it is affecting other users.

@iAMkVIN-S, are you still experiencing this?

@MarkDuckworth
Copy link
Contributor

@manoslive, can you tell us when this error started occurring for you?

@ebeloin
Copy link

ebeloin commented Jan 25, 2023

I first noticed it on 01/17/23 at 1:57pm.

Also, I have issue starting it back up. Everytime I reboot i get the same error. Currently, we can't use our backend.

Here's some logs from my node.js backend:

(node:1) UnhandledPromiseRejectionWarning: Error: 14 UNAVAILABLE: 403:Forbidden
at Object.callErrorFromStatus (/usr/src/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
at Object.onReceiveStatus (/usr/src/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client.js:330:49)
at Object.onReceiveStatus (/usr/src/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:299:181)
at /usr/src/app/node_modules/google-gax/node_modules/@grpc/grpc-js/build/src/call-stream.js:145:78
at processTicksAndRejections (internal/process/task_queues.js:82:9)
Caused by: Error
at Query._get (/usr/src/app/node_modules/@google-cloud/firestore/build/src/reference.js:1485:23)
at Query.get (/usr/src/app/node_modules/@google-cloud/firestore/build/src/reference.js:1474:21)
at Function.find (/usr/src/app/dist/api/requests.js:420:35)
at /usr/src/app/dist/app.js:43:27
at Layer.handle [as handle_request] (/usr/src/app/node_modules/express/lib/router/layer.js:95:5)
at next (/usr/src/app/node_modules/express/lib/router/route.js:137:13)
at Route.dispatch (/usr/src/app/node_modules/express/lib/router/route.js:112:3)
at Layer.handle [as handle_request] (/usr/src/app/node_modules/express/lib/router/layer.js:95:5)
at /usr/src/app/node_modules/express/lib/router/index.js:281:22
at Function.process_params (/usr/src/app/node_modules/express/lib/router/index.js:335:12)
(node:1) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 26)

@iAMkVIN-S
Copy link
Author

iAMkVIN-S commented Jan 25, 2023

@MarkDuckworth We are very much still getting the issue. This has become mission critical as its impacting our clients businesses with so much downtime.

Thankfully our backend does reboot when we get the error and sometimes get away with a couple hundreds thousand OK requests before we randomly get the error again, though not always on the first go.

Since Saturday Jan 21st in the early AM, we've logged everytime a reboot was necessary due to the error:

266 times in 96 hours. This is ridiculous. How did we process 250 Million + requests over the last 2 years from this one backend alone, and without making any changes, woke up on Jan 17th to non-stop crashing.

(timestamps in milliseconds)

RESTARTS: (266)
1674289037983000
1674291718989000
1674292718477000
1674292831277000
1674293015488000
1674293207867000
1674293351222000
1674293482690000
1674293671756000
1674293925832000
1674294691855000
1674295176718000
1674295368606000
1674302961810000
1674310266033000
1674313970101000
1674314074095000
1674317780255000
1674320533465000
1674323402254000
1674323508347000
1674327216832000
1674330920530000
1674331027280000
1674334735600000
1674342048007000
1674342156709000
1674342263106000
1674342366656000
1674349721840000
1674349827806000
1674353536885000
1674353643205000
1674353755830000
1674353862040000
1674357585725000
1674361293747000
1674361408919000
1674361522736000
1674368903827000
1674370885206000
1674372981068000
1674373297597000
1674378425390000
1674378560965000
1674378689308000
1674379576327000
1674380770858000
1674381065547000
1674381366523000
1674381489429000
1674381778539000
1674381925985000
1674382068010000
1674382204064000
1674382470922000
1674382610478000
1674386339199000
1674386991775000
1674390691809000
1674390804020000
1674391161370000
1674391285930000
1674391412342000
1674391520708000
1674392531052000
1674392656840000
1674393593907000
1674393718321000
1674397443718000
1674397562319000
1674401405814000
1674401513179000
1674401636493000
1674414132380000
1674421437700000
1674421542385000
1674425242982000
1674428944012000
1674429046910000
1674432751493000
1674436484620000
1674440191661000
1674447512418000
1674447618271000
1674451333586000
1674451446735000
1674451565913000
1674455457640000
1674455581824000
1674458298975000
1674458449492000
1674459018662000
1674459141174000
1674459430226000
1674459590105000
1674461522378000
1674461658890000
1674462242309000
1674464055770000
1674464333230000
1674468106528000
1674471818514000
1674471925167000
1674472030824000
1674473252294000
1674473358435000
1674473462695000
1674473568298000
1674473672535000
1674473778392000
1674473883947000
1674474374371000
1674474480382000
1674478188324000
1674478293910000
1674478401257000
1674478505642000
1674478612956000
1674478719303000
1674478821309000
1674478924334000
1674482638673000
1674484151501000
1674486801540000
1674490522969000
1674490653601000
1674490777917000
1674490883999000
1674493345650000
1674493467698000
1674493583417000
1674497303337000
1674501010890000
1674508329499000
1674508453084000
1674515759076000
1674515882091000
1674519601762000
1674519708093000
1674523155931000
1674523281901000
1674524492775000
1674524618280000
1674524735337000
1674528462710000
1674528591823000
1674528713547000
1674528838422000
1674532585741000
1674532708146000
1674536415733000
1674536581604000
1674536737057000
1674538504041000
1674538644874000
1674539028197000
1674539217798000
1674539356701000
1674539484498000
1674543149732000
1674545035427000
1674545930536000
1674547972890000
1674548133400000
1674548289753000
1674549122357000
1674549798045000
1674550204086000
1674550594478000
1674550718495000
1674554486809000
1674554613388000
1674554737501000
1674557057843000
1674557188193000
1674560704323000
1674560818521000
1674560940761000
1674561063958000
1674561190032000
1674561304001000
1674565008898000
1674565134900000
1674565264118000
1674565387570000
1674569093243000
1674572818426000
1674572942317000
1674573069250000
1674573193383000
1674573303982000
1674578422675000
1674579731853000
1674583450489000
1674583574515000
1674587718965000
1674587835103000
1674587945272000
1674591660099000
1674591782559000
1674595489365000
1674595608926000
1674599325835000
1674599453186000
1674603168959000
1674603293688000
1674603417479000
1674607127323000
1674607243646000
1674607355700000
1674618287483000
1674618402394000
1674618524873000
1674618629396000
1674622386776000
1674624004053000
1674624125431000
1674626961823000
1674627098267000
1674627226830000
1674627381640000
1674628270592000
1674628559700000
1674630545011000
1674633704752000
1674635882699000
1674636377943000
1674638420986000
1674638542684000
1674638652167000
1674642369224000
1674642491417000
1674642615681000
1674642740182000
1674646504005000
1674646628981000
1674646756498000
1674646882099000
1674647235196000
1674647359336000
1674651450683000
1674651572737000
1674651697308000
1674655419030000
1674655537158000
1674655660877000
1674655782904000
1674655906452000
1674656030740000
1674656140539000
1674659863790000
1674659991764000
1674663708063000
1674663834388000
1674663958937000
1674664085315000
1674664203979000
1674664326312000
1674664451834000
1674666138411000
1674666263919000
1674666387484000
1674666508982000
1674666615555000
1674666735555000

@MarkDuckworth
Copy link
Contributor

Thanks for the update. I communicated this to the backend team. @manoslive, if you can provide a project ID, that will help them diagnose. You may want to create a support ticket where you can provide your project ID. In that ticket reference internal issue b/266097991, so we can link your info to the right place.

@ebeloin
Copy link

ebeloin commented Jan 26, 2023

Thanks for the update. I communicated this to the backend team. @manoslive, if you can provide a project ID, that will help them diagnose. You may want to create a support ticket where you can provide your project ID. In that ticket reference internal issue b/266097991, so we can link your info to the right place.

Done. Thanks for the reply.

@dconeybe
Copy link
Contributor

Here are two things that you could try which might mitigate the issue until we find the root cause and/or implement a fix.

Idea 1: Try the new "enable rest" option.

The networking errors are coming from the grpc networking layer. We use grpc in the sdk because it provides the complex bi-directional communication infrastructure that is needed for streaming queries (e.g. onSnapshot() callbacks). However, it looks like the queries that are causing you problems are not streaming queries, but rather just one time request/response queries. These "simpler" queries can be done using a simpler networking protocol that we call "rest". These "rest" requests may even hit different Google edge nodes that potentially are not affected by this permissions issue.

@MarkDuckworth recently added a new "prefer rest" option to the sdk which I think you should try. By enabling this new option, the requests will be made to the backend using the "rest" protocol, until a request for bi-directional streaming (i.e. onSnapshot()) is made, at which point the sdk will start using grpc and will only use grpc thereafter.

To use the new "prefer rest" option, use this code to initialize the Firestore object: initializeFirestore(app, { preferRest: true }) (more details here: firebase/firebase-admin-node#1901 (comment)). Just note that as soon as you call onSnapshot() that grpc will be used thereafter.

Idea 2: Discard clients that fail with this error

The SDK has some logic to stop using clients that fail with the RST_STREAM error (

.catch(async (err: GoogleError) => {
if (err.message?.match(/RST_STREAM/)) {
// Once a client has seen a RST_STREAM error, the GRPC channel can
// no longer be used. We mark the client as failed, which ensures that
// we open a new GRPC channel for the next request.
this.failedClients.add(client);
}
await this.release(requestTag, client);
return Promise.reject(err);
})
). You could try to activate this logic on the permissions errors that you have been experiencing too. You'd have to hack patch this fix in, but it would be great to know if it works for you or not.

Here is how you would do it:

  1. Grep through your node_modules directory for the string RST_STREAM.
  2. Replace the occurrences of RST_STREAM in the err.message?.match(/RST_STREAM/) (which may look difference once it's transpiled into JavaScript) to UNAVAILABLE or Forbidden.
  3. Rebuild your app to pick up the changes.

It would be really useful to know if discarding the clients that experience this permissions error recover when a new connection is opened.

@iAMkVIN-S
Copy link
Author

iAMkVIN-S commented Feb 1, 2023

Good morning @dconeybe - Thanks a ****ton for the very useful reply.

We're so relieved that we weren't in fact doing anything particularly wrong and something had indeed changed on the Cloud side of things.

Within the hour of you posting last night, we started going through both your ideas, given how critical it was for the projects.
In testing thus far, we have not had an error since Tuesday, January 31, 2023 4:23:03 PM (0 since implementing Idea 1).
Thankfully we do not use snapshots on the backend.

As for Idea 2, it's not as straight forward for us to quickly do, since we're deploying a docker container. This means that the code gets uploaded to a Build Machine, from there everything is handled automatically within a Dockerfile and built to serve in a container. As mentioned above though, since we don't use snapshots on the backend side, it doesn't seem like we'll need to go further into Idea 2 if we continue not to crash by bypassing GRPC

I'll post an update later today

@iAMkVIN-S
Copy link
Author

@MarkDuckworth / @dconeybe - We've added another 24hrs without a crash on any of the projects from Idea 1.

It sounds like we have our temporary fix until a permanent solution on the Firebase side!

@dconeybe
Copy link
Contributor

dconeybe commented Feb 2, 2023

That's great news! Thanks for the update and for trying out the ideas. I'm glad we found something to get you unblocked.

One thing, though, is that since we cannot reproduce this issue ourselves, we may need you to reproduce it for us in order to investigate to find the root cause. Maybe consider how you might go about reproducing this problem in a way that doesn't grind your business to a halt.

We will get back to you with suggested next steps, either via this GitHub issue or via your support ticket.

@iAMkVIN-S
Copy link
Author

iAMkVIN-S commented Feb 2, 2023

@dconeybe It's unfortunate to hear that you are unable to reproduce it. Moreso, it's very suprising to me that we aren't seeing more of the 1.1M weekly installers of firebase-admin coming to the issue considering that our usage of this package is in its simplest and documented form.

We'll be more than happy to get any other data you need to figure out the exact root cause, please do let us know what you'd like us to do! The team is at your disposal.

@PL47productions
Copy link

@iAMkVIN-S Count me as one of the 1.1M weekly installers. My cloud functions have been running smoothly for months, then yesterday I started receiving 403 Forbidden error messages with all of my functions that make calls to an outside API. This is a critical issue as the cloud functions are core to my application.

Below is one of my functions that began failing yesterday. I first have to retrieve a token from my firestore and then make the request to the outside API (Finicity). I have confirmed the url and parameters are all generated and passed correctly in the logs of each failed attempt. Once the url is received, it is passed back to the end user.

It's important to note that when running my functions from a firebase shell environment, everything runs smoothly and the calls succeed . However, when I invoke the functions from my application as an end user, I receive the 403 Forbidden error message. To that end, I also have a function that runs every 80 minutes making a call to an outside api that, again runs fine when I invoke it locally, but when it is called automatically, it just started failing yesterday nearly every time.

Sample Function
` exports.refreshPromiseCustomUrl = functions.https.onCall(async(data,context) => {

   const customerId = data.customerId;
  
    const refreshedUrl = await admin.firestore().collection('SampleCollection').doc('sampleDoc').get().then(doc => {
      const promises = [];
      promises.push(doc.data().accessToken);
      return Promise.all(promises)    
    }).then(async token => {
      const secondPromise = []
   
      const dataForUrl = JSON.stringify({"partnerId":"MyPartnerId","customerId":customerId});
      const configForUrl = {method: 'post',url: 'https://api.finicity.com/connect/v2/generate',
      headers: {'Accept': 'application/json',
           'Content-Type': 'application/json', 
          'Finicity-App-Key':'MyAppKey',
          'Finicity-App-Token': token[0],   
          },
        data : dataForUrl
      };
    
      const theRequestUrl = await axios(configForUrl).then(res => {
        return res.data.link
      }).catch(e => {
        functions.logger.error('Error reaching out to finicity, the token: ', token[0], 'customer Id: ', customerId, 'error message: ', e)
        return ''
      })

      secondPromise.push(theRequestUrl)
      return Promise.all(secondPromise);
   
    }).catch(err => {
      functions.logger.error('error  ', err)
      return '';
    })
  
  return{theUrl: refreshedUrl[0]}
  
  })`

Error Message abbreviated
[Request failed with status code 403 ](response: { status: 403, statusText: 'Forbidden',)

package.json
{ "name": "functions", "description": "Cloud Functions for Firebase", "scripts": { "lint": "eslint .", "serve": "firebase serve --only functions", "shell": "firebase functions:shell", "start": "npm run shell", "deploy": "firebase deploy --only functions", "logs": "firebase functions:log" }, "engines": { "node": "16" }, "dependencies": { "@sendgrid/mail": "^6.5.5", "axios": "^1.3.2", "cors": "^2.8.5", "firebase-admin": "^11.5.0", "firebase-functions": "^4.2.1", "googleapis": "^107.0.0", "moment": "2.26.0", "nodemailer": "^2.5.0", "stripe": "^6.36.0", "uuid": "^3.3.2" }, "devDependencies": { "eslint": "^5.12.0", "eslint-plugin-promise": "^4.0.1" }, "private": true }

I feel I have exhausted my options at this point. I have updated all of my dependencies, tried variations of my functions, checked all of my parameters for the api call are set correctly, confirmed with Finicity that nothing has changed on their end, confirmed user access to my functions, tried using different function names, tried increasing the memory and instances allocated to each function...nothing has worked.

I will note, however, that randomly every once in a while a call will go through successfully. No idea why. I am absolutely stuck at this point.

Just hoping this issue gets resolved as soon as possible.

@iAMkVIN-S
Copy link
Author

iAMkVIN-S commented Feb 9, 2023

Hey @PL47productions

I'm not sure if your error is exactly the same as we were getting (Error: 14 UNAVAILABLE: 403:Forbidden)

However, did you try Idea 1 from @dconeybe above?

Where you Initialize Firebase Admin, you can pass in a param that tells the SDK to use the REST API when the request is not a snapshot.

What we had before the change:

admin.initializeApp({
    credential: <Your Credentials JSON>,
});

What we had after the change:

import { initializeApp } from "firebase-admin/app";
import { initializeFirestore } from "firebase-admin/firestore";

const app = initializeApp({
    credential: <Your Credentials JSON>,
});
initializeFirestore(app, { preferRest: true });

If your error is the same as ours, this will fix it temporarily, since I do not see a snapshot in your above request.

P.S. The error isn't from an external request, it's from the .get() where you fetch from Firebase on line 2.

@MarkDuckworth
Copy link
Contributor

@PL47productions, thanks for letting us know that this is also impacting you. I updated the internal conversation with this information.

As @iAMkVIN-S mentioned, using preferRest may provide a workaround. You can consider using preferRest if you don't use onSnapshot(...) in your application.

@PL47productions
Copy link

@iAMkVIN-S Thanks and yes, not exactly the same issue...but in the same ball park. All functions running smoothly for months, no changes, then bam a bunch of 403 forbidden errors out of no where. Unfortunately your suggestion didn't do the trick for me.

@MarkDuckworth Thanks, this is a frustrating one for sure. I'm testing the unirest library out instead of axios to see if that helps at all.

@dconeybe
Copy link
Contributor

@PL47productions Could you reproduce with debug logging enabled and provide the logs via a https://gist.github.com/ link? If you're not comfortable sharing the logs publicly, please open a support ticket at https://firebase.google.com/support where the logs can be submitted privately.

First, enable debug logging by registering a logging function (you can probably just use console.log):

https://cloud.google.com/nodejs/docs/reference/firestore/latest/overview#functions

Second, enable verbose GRPC logging by setting GRPC_VERBOSITY set to debug:

https://github.com/grpc/grpc-node/blob/master/TROUBLESHOOTING.md

@PL47productions
Copy link

https://gist.github.com/PL47productions/87422a54807d6280462920f7e0cb2d01

The above link is the error output I receive for the function below that was running fine for months.
` exports.refreshFinToken = functions.pubsub.schedule('every 80 minutes').onRun(async (context) => {

  const data = JSON.stringify({"partnerId":"MYID", "partnerSecret":"MYSECRET"});

 
   let config = {
     method: 'post',
     url: 'https://api.finicity.com/aggregation/v2/partners/authentication',
     headers: { 
       'Accept': 'application/json',
       'Content-Type': 'application/json',      
       'Finicity-App-Key':'MYKEY'
     },
     data : data
   };
 

 const updatedToken = await axios(config).then(response =>  {
     const promises = [];
     promises.push(response.data.token);
     
     

   
      
       return Promise.all(promises);
       
     }).catch(e => {
      functions.logger.error('Error retrieving token from finicity, error message: ', e)
  
      return '';
     }).then(async token => {
       const secondPromise = [];
       secondPromise.push(token[0])
       let tokenString = String(token[0])
       
       if(typeof token[0] === 'undefined'){
        sendTokenError();
        //handleRetryGetAccessToken()
        return Promise.all(secondPromise);
       }
       else {

        let updateResult =  await admin.firestore().collection('COLLECTION').doc('DOC').update({accessToken: tokenString}).then((res) => {
    
      
          return 'Successfully updated';
           }).catch((e) => {
            functions.logger.error('Error adding token to data base, token: ',token[0], 'error: ', e)
   
            return 'Error adding token to database';
           })
          
           return Promise.all(updateResult)
       }

     }).catch((e) => {
   
       return 'Encountered error'
     })
 

 
 
   
 
   return{success: updatedToken}
 
 })`

I'll try enabling verbose logging and send those along.

@dconeybe
Copy link
Contributor

@PL47productions The error message from your logs look like the problem is connecting to https://api.finicity.com. This doesn't appear to be related to Firestore (this GitHub repository is for Firestore). Are you perhaps suggesting that this is an issue beyond Firestore and also affects other, unrelated products? That would make sense to me because otherwise we would be experiencing a massive Firestore outage rather than one or two customers experiencing this "403: Forbidden" error. Maybe there is some common infrastructure that coincidentally both your traffic and OP's traffic go through which cause this error and the traffic never even makes it to Firestore servers. That would also make sense because the Firestore backend engineers could not find any trace of OP's "403: Forbidden" errors in our server logs.

@iAMkVIN-S
Copy link
Author

@dconeybe I don't think our error and @PL47productions are the same.

First, the error log is different.

Second, our failing requests were for a simple .get() from the Firebase Admin SDK. Without any external API in play.

Third, we've had 0 errors since adding the preferRest param in the Initialize.

We still are building up a sandbox backend to reproduce the error but with the additional logs requested in the support ticket. If time is on our side, we'll hopefully have that to you today.

@PL47productions
Copy link

Apologies, our issues are different indeed...mine is related to errors within firebase functions, not with firestore. I opened a support ticket with firebase and in the meantime, I added a region code to all of my calls which seems to have fixed the issue...or it just randomly started working again just as randomly as it stopped.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: firestore Issues related to the googleapis/nodejs-firestore API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

No branches or pull requests

5 participants