WebSocket postToConnection fails with 403

Hello, I have a simple WebSocket implementation that works offline but fails in AWS. I can connect successfully but cannot send a response with ApiGatewayManagementApi.postToConnection(). I am able to log a mysterious 403 in CloudWatch:

ForbiddenException
    at deserializeAws_restJson1PostToConnectionCommandError (/var/task/node_modules/@aws-sdk/client-apigatewaymanagementapi/dist/cjs/protocols/Aws_restJson1.js:287:41)
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at async /var/task/node_modules/@aws-sdk/middleware-serde/dist/cjs/deserializerMiddleware.js:6:20
    at async /var/task/node_modules/@aws-sdk/middleware-signing/dist/cjs/middleware.js:12:24
    at async StandardRetryStrategy.retry (/var/task/node_modules/@aws-sdk/middleware-retry/dist/cjs/defaultStrategy.js:56:46)
    at async /var/task/node_modules/@aws-sdk/middleware-logger/dist/cjs/loggerMiddleware.js:6:22
    at async PubSub.send (/var/task/PubSub.js:73:24)
    at async Handler.wsConnection (/var/task/Handler.js:215:9) {
  '$fault': 'client',
  '$metadata': {
    httpStatusCode: 403,
    requestId: 'ae81fd40-74f8-4cf5-8da5-a896d8592941',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  }
}

I am setting up my ApiGatewayManagementApi (in my class constructor):

this.api = new ApiGatewayManagementApi({
  apiVersion: "2018-11-29",
  endpoint: props.apiGatewayEndpoint, // https://{id-from-serverless-deploy}.execute-api.ca-central-1.amazonaws.com/dev
});

And calling:

const params = {
  ConnectionId: connection, // Confirmed websocket connectionId
  Data: Buffer.from(JSON.stringify(res)), // Confirmed JSON payload
};

const retVal = await this.api.postToConnection(params);
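For anyone debugging a similar failure, here is a minimal sketch of pulling the useful fields out of an SDK v3 error like the ForbiddenException above, so they land on a single log line. The helper name summarizeSdkError is hypothetical, and the field names are simply the ones visible in the logged error object:

```javascript
// Hypothetical helper (not part of the AWS SDK): flatten the fields seen in the
// logged ForbiddenException so they fit on one CloudWatch log line.
function summarizeSdkError(err) {
  const meta = (err && err.$metadata) || {};
  return {
    name: (err && err.name) || "UnknownError",
    fault: (err && err.$fault) || "unknown",
    status: meta.httpStatusCode,
    requestId: meta.requestId,
    attempts: meta.attempts,
  };
}

// Usage sketch, assuming `api` and `params` are set up as in the post above:
//   try {
//     await api.postToConnection(params);
//   } catch (err) {
//     console.error("postToConnection failed", JSON.stringify(summarizeSdkError(err)));
//     throw err;
//   }
```

A status of 403 with a `client` fault points at permissions or at the request URL rather than at the payload, which is where the rest of this thread ends up.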

This particular postToConnection() is invoked after a successful connection, so the associated serverless.yml section is as follows (it works locally):

wsConnection:
  handler: Handler.wsConnection
  events:
    - websocket:
        route: $connect
    - websocket:
        route: $disconnect

Just spent a couple more hours on this conundrum.

Still no success :frowning: I have tried setting IAM roles for the websocket API although surely that is what Serverless is supposed to do. It did not make any difference.

- Effect: Allow
  Action:
    - "execute-api:ManageConnections"
    - "execute-api:Invoke"
  Resource:
    - 'arn:aws:execute-api:ca-central-1:xxx:yyy/*/$connect'
    - 'arn:aws:execute-api:ca-central-1:xxx:yyy/*/$disconnect'
    - 'arn:aws:execute-api:ca-central-1:xxx:yyy/*/join'
    - 'arn:aws:execute-api:ca-central-1:xxx:yyy/*/$default'

My next step is to abandon the Serverless Framework and see if this can be done with AWS SAM CLI.

I am using this:

WSConnectionRespondPolicy:
  Effect: Allow
  Action:
    - "execute-api:ManageConnections"
  Resource:
    - "arn:aws:execute-api:*:*:**/@connections/*"
With a screenshot, as the formatting gets screwed up

I think you are using the wrong resource targets :slight_smile:
@tforster I have it up and running, happy to help if you need to talk it through: )
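To make the difference in resource targets concrete: the route ARNs (ending in $connect, $disconnect and so on) cover invoking the API, whereas posting back to a client is checked against an @connections resource. As I understand the AWS documentation, that resource has the shape api-id/stage/POST/@connections/connection-id. A minimal sketch of building a scoped ARN follows; the helper name connectionsArn is hypothetical, and the values in the usage line are the placeholders from the post above:

```javascript
// Hypothetical helper: build the ARN that execute-api:ManageConnections is
// evaluated against when calling postToConnection (an HTTP POST).
// `stage` defaults to a wildcard; the trailing * matches any connection ID.
function connectionsArn({ region, accountId, apiId, stage = "*" }) {
  return `arn:aws:execute-api:${region}:${accountId}:${apiId}/${stage}/POST/@connections/*`;
}

// Placeholders taken from the policy quoted earlier in the thread.
console.log(connectionsArn({ region: "ca-central-1", accountId: "xxx", apiId: "yyy" }));
```

The fully wildcarded form used in this thread is simpler to write. A scoped ARN like this one limits the permission to a single API, and it only covers POST, so getConnection and deleteConnection would need their own entries.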

Not sure about your endpoint settings either. When I deploy both the API and WS on the same stack, they each get their own domain.
You can find them by running

sls info

Sample output:

Config for the wsManagement api

    WEBSOCKET_API: !Ref WebsocketsApi

here

  return new ApiGatewayManagementApi({
    apiVersion: "2018-11-29",
    endpoint: `${process.env.WEBSOCKET_API}.execute-api.${process.env.AWS_REGION}.amazonaws.com/${process.env.STAGE}`,
  });

posting itself

  const postParams = {
    Data: JSON.stringify(body),
    ConnectionId: body.connectionID,
  };
  try {
    return await API_GW.postToConnection(postParams).promise();
  ....
  }
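For readers who want a complete version of the posting step, here is a minimal sketch. It is not the code elided above. It assumes an aws-sdk v2 client, as in the snippet, and the helper names (buildPostParams, isStaleConnection, send) are hypothetical. The one behaviour it relies on is that API Gateway answers 410 Gone when the client has already disconnected:

```javascript
// Hypothetical helpers; only postToConnection(...).promise() is a real SDK v2 call.
function buildPostParams(connectionId, body) {
  if (typeof connectionId !== "string" || connectionId === "") {
    throw new TypeError("connectionId must be a non-empty string");
  }
  return { ConnectionId: connectionId, Data: JSON.stringify(body) };
}

// True when the error looks like HTTP 410 Gone (client already disconnected).
// v2 errors expose statusCode; v3 errors expose $metadata.httpStatusCode.
function isStaleConnection(err) {
  const status = err && (err.statusCode || (err.$metadata && err.$metadata.httpStatusCode));
  return status === 410;
}

// `api` is assumed to be an aws-sdk v2 ApiGatewayManagementApi instance.
// Resolves to true when delivered, false when the connection is stale.
async function send(api, connectionId, body) {
  try {
    await api.postToConnection(buildPostParams(connectionId, body)).promise();
    return true;
  } catch (err) {
    if (isStaleConnection(err)) return false; // caller may delete the stored connection ID
    throw err; // anything else (for example a 403) should surface
  }
}
```

Returning false for a stale connection lets the caller clean up its connection table without treating a normal disconnect as a failure.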

@jadczakd Thank you for responding!

Funnily enough, I stumbled across arn:aws:execute-api:*:*:**/@connections/* yesterday and with that I was able to get things working. I just wish (like everyone) that AWS had better documentation.

I am seeing two different domains in sls info, which bears out a single mention I stumbled upon early in this exercise saying that API Gateway does not support HTTP and WS on the same domain.

Curious though, you have one line that says WEBSOCKET_API: !Ref WebsocketsApi what is that referencing?

The two most annoying aspects are the same two annoying aspects I have battled since the first release of the Serverless Framework.

  1. Terrible documentation from AWS
  2. The Serverless Offline Plugin is a poor approximation of the actual environment and does not enforce any permissions. It is so difficult to debug an AWS issue that cannot be reproduced locally because Serverless Offline lets you get away with just about anything.

Thanks for listening to my rant :wink:

And thank you for responding with the solution!

Curious though, you have one line that says WEBSOCKET_API: !Ref WebsocketsApi what is that referencing?

You can only use this reference when you have a WS API defined. AFAIK it resolves to the ID of the WebSocket API, which is the first part of the URI you use for WS connectivity. I don’t have any object by that name defined anywhere in my ymls : )

To be honest, I am entirely against testing serverless locally for anything like chained functions. In my opinion it works well only with single, simple req → res functions. Otherwise the overhead of maintaining a local virtual env is simply too big: think of when you start using different sorts of queues and database triggers to run your lambdas. Any time I try to use a local browser with methods behind a custom authorizer (I think POST is usually the problematic one) there are issues as well. The custom authorizer itself needs another plugin to mimic it locally, so there is simply too much overhead. From my experience it’s way cheaper to do it in the cloud and to invest in automating the removal of non-essential envs (think, for example, of a cron that runs every Friday at 16:00 and sends you home because your env can’t be used until Monday :slight_smile: ).

Most cloud providers will let you define a stage in some way, shape or form.
I would advise doing the following

provider:
  name: aws
  stage: ${opt:stage}

and simply passing the stage to the deploy command like so

sls deploy --stage $STAGE

Then you can easily spawn ephemeral environments for features / developers depending on the need.

Does WEBSOCKET_API: !Ref WebsocketsApi go in the serverless.yml file? If so, under which section/heading?

W/R to testing locally, how do you develop then? With no debugging capability in AWS I would say that all but the most trivial function would be a nightmare to develop blindly. Being able to step through code and inspect the stack each step of the way is imperative and thus the reason the serverless offline capability is a critical feature of the serverless framework.

W/R to stages I generally use four which correspond to local dev, then a dev, stage and prod in AWS itself. I have the stage variable configured as part of a larger project infrastructure that ties in documentation, testing and deployment. This ends up as sls deploy --aws-profile {project-specific-profile}

Oh, and just to circle back to the original problem. The “arn:aws:execute-api:*:*:**/@connections/*” ARN was definitely causing problems. However, I am all but convinced that AWS Lambda does not support the AWS SDK for JavaScript v3.

The following works both locally and in AWS (lots of code removed for brevity)

const AWS = require("aws-sdk");
const endpoint = "https://etc/etc";
const api = new AWS.ApiGatewayManagementApi({
  apiVersion: "2018-11-29",
  endpoint,
});
// v2: the call returns a request object, so .promise() is needed
return api.postToConnection(params).promise();

But this only works locally and fails silently in AWS (as in I can’t get anything to log to CloudWatch)

const { ApiGatewayManagementApi } = require("@aws-sdk/client-apigatewaymanagementapi");
const endpoint = "https://etc/etc";
const api = new ApiGatewayManagementApi({
  apiVersion: "2018-11-29",
  endpoint,
});
// v3: the call already returns a promise
return api.postToConnection(params);

I can use the older v2 format but it’s a shame as the monolithic architecture is adding an additional 10 MB of unused code to my function.

Have you used the newest JS SDK? I have it working in other Lambdas for DynamoDB.

Stumbled onto this issue and spent the entire day today trying to figure it out… As it turns out, this has been a V3 SDK bug for some time, but there is a workaround.

See here for more details:

Hope this helps!
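A hedged sketch of what such a workaround can look like, assuming the bug in question is the one where the v3 client drops the stage path (the /dev part) from the configured endpoint, so that requests go to /@connections/{id} without the stage and come back as 403. The helper name prependStage is hypothetical; middlewareStack.add and the "build" step are part of the v3 SDK's middleware API:

```javascript
// Hypothetical helper: returns an SDK v3 middleware that puts the stage back
// in front of the request path, e.g. "/@connections/abc" -> "/dev/@connections/abc".
function prependStage(stage) {
  // Normalise "dev", "/dev" and "/dev/" to "/dev".
  const prefix = "/" + String(stage).replace(/^\/+|\/+$/g, "");
  return (next) => async (args) => {
    const request = args.request;
    // Skip requests that already carry the stage, so the path is never doubled.
    if (request && typeof request.path === "string" && !request.path.startsWith(prefix + "/")) {
      request.path = prefix + request.path;
    }
    return next(args);
  };
}

// Usage sketch, assuming `client` is an ApiGatewayManagementApi instance from
// @aws-sdk/client-apigatewaymanagementapi and the stage is "dev":
//   client.middlewareStack.add(prependStage("dev"), { step: "build" });
```

If the SDK version in use already keeps the endpoint path, the startsWith check makes the middleware a no-op, so it should be safe to leave in place while upgrading.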

I would say that you’ve hit the nail on the head: this is by far the worst part of serverless development right now. I am not aware of any perfect solutions. I am using X-Ray and keep my functions small, but I feel your pain. Imagine mine in an environment where I use SQS queues / SNS topics as triggers and the lambdas are merely “step functions”, so there is a lambda chain that spawns multiple ones.

Don’t get me wrong, I feel like local testing is OK for simple scenarios, but otherwise it is such a cluster-mess to manage that I would rather not do it.

I can use the older v2 format but it’s a shame as the monolithic architecture is adding an additional 10 MB of unused code to my function.

Have you used the newest JS SDK? I have it working in other Lambdas for DynamoDB.

Regarding this I am using:

  "version": "2.824.0",

so I can’t really say, but I would try the ref suggestion.
I would also advise not bundling the AWS SDK at all. It’s globally available in Lambda functions. To limit bundle sizes, I use serverless-bundle (from npm) with its forceExclude option, plus layers. It optimizes the deployments significantly, but if you’re looking to lower the startup time (cold starts / processing times) it doesn’t seem to do much.

Does WEBSOCKET_API: !Ref WebsocketsApi go in the serverless.yml file? If so, under which section/heading?

In my service I simply put it in the env of the stack, so that it’s globally available.
like so:

provider:
  ...
  environment:
    WEBSOCKET_API: !Ref WebsocketsApi
    ...

If you take a look at my example, I would not have been bitten by the mistake discussed in the issue, because I am passing all the changeable parts of the identifier explicitly (stage and region are provided in a manner similar to the above, which I know is redundant but is more explicit at the same time).

return new ApiGatewayManagementApi({
    apiVersion: "2018-11-29",
    endpoint: `${process.env.WEBSOCKET_API}.execute-api.${process.env.AWS_REGION}.amazonaws.com/${process.env.STAGE}`,
  });
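The same idea as a small, testable function. This is a minimal sketch: the name managementEndpoint is hypothetical, and adding the https:// scheme and failing fast on a missing part are my own choices rather than something from the snippet above:

```javascript
// Hypothetical helper: assemble the management endpoint from its explicit parts
// and throw early if any of them is missing, instead of producing a URL that
// contains "undefined".
function managementEndpoint({ apiId, region, stage }) {
  for (const [name, value] of Object.entries({ apiId, region, stage })) {
    if (typeof value !== "string" || value === "") {
      throw new Error(`Missing endpoint part: ${name}`);
    }
  }
  return `https://${apiId}.execute-api.${region}.amazonaws.com/${stage}`;
}

// Usage sketch with the environment variables from the example above:
//   const endpoint = managementEndpoint({
//     apiId: process.env.WEBSOCKET_API,
//     region: process.env.AWS_REGION,
//     stage: process.env.STAGE,
//   });
```

Failing at construction time turns a forgotten environment variable into an obvious error message rather than a confusing 403 later on.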

@tforster lmk if that helped. I am really happy to jump on a call to discuss it & help should you need it :slight_smile:

Addendum: I’ve just found LocalStack (GitHub: localstack/localstack), described as “a fully functional local AWS cloud stack” for developing and testing cloud and serverless apps offline, in a serverless newsletter. Haven’t given it a go yet, but will do soon.