
Load Testing & Profiling


Load testing is the process of putting demand on a system and measuring its response. (Wikipedia)

In software engineering, profiling ("program profiling", "software profiling") is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. (Wikipedia)

What will you find in this section?

In June 2018 we load tested, profiled, and generated a flame graph for the alert matching logic. At the time we believed we might have to do this many more times in the future, so we decided to document the process we followed; hence this wiki page. For more information, please see the Asana ticket for that task.

More specifically, this section contains our notes on how to load test, profile, and generate flame graphs. These can help identify "matching" performance improvement opportunities and estimate how long alert matching might take against a given large number of users/subscriptions.

tl;dr - These notes should be useful as a reference if you'd like to load test the matching logic, profile it, and generate flame graphs.

Load Testing Alert Matching

Matching Logic

As of this writing, the matching logic is triggered within SubscriptionFilterEngine.schedule_all_notifications/1. This function accepts a list of alerts and schedules notifications (emails or SMS) for all users who should receive one. Note that "matching time" is logged within this function; this is the data point we care about when optimizing what we refer to as "alert matching time".
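
For reference, a minimal sketch of the call shape (alerts below is a placeholder for a list of already-parsed alerts, not a real binding in the project):

# Illustrative only -- `alerts` stands for whatever list of parsed alerts you have on hand.
# The "matching time" measurement is logged from inside this call.
SubscriptionFilterEngine.schedule_all_notifications(alerts)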

Setup

Consider the following setup steps when preparing the project and DB for load testing. Keep in mind that freezing the alerts feed and the current datetime, as described below, will affect the subscriptions that flow through the matching pipeline, so consider the changes you'll be testing before deciding whether it makes sense to do so.

1. Freeze alerts feed (from prod)

Get a copy of the alerts feed from production, then start a server that returns that static copy. You could try something similar to:

  • mkdir static_alerts_feed
  • cd static_alerts_feed
  • curl -0 http://s3.amazonaws.com/mbta-realtime-prod/alerts_enhanced.json -o index.html
  • python -m SimpleHTTPServer 8000
  • set the ALERT_API_URL environment variable to http://localhost:8000 (see the example after this list)
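
For example, assuming you keep environment variables in the project's .env file (the same file the profiling commands below read from), the entry might look like:

ALERT_API_URL=http://localhost:8000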

2. Freeze datetime

Change the default argument of SubscriptionFilterEngine.determine_recipients/4 to the datetime you'd like to use. For example:

DateTime.from_naive!(~N[2018-04-02 08:00:00], "Etc/UTC")
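
As a hypothetical sketch of that edit (the parameter names below are illustrative, not the real signature; the only point is replacing the default "now" with a fixed datetime):

# Hypothetical sketch -- parameter names are illustrative.
def determine_recipients(alerts, subscriptions, notifications,
      now \\ DateTime.from_naive!(~N[2018-04-02 08:00:00], "Etc/UTC")) do
  # existing matching logic unchanged
end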

3. Create X number of users (with trips and subscriptions)

Copy production's users, trips, and subscriptions DB tables to your local development DB. Run the following to dump production and sanitize any private data from the results:

PGPASSWORD=<PROD PASSWORD> pg_dump --host=alerts-concierge-prod.cw84s0ixvuei.us-east-1.rds.amazonaws.com --port=5432 --username=alerts_concierge --dbname=alerts_concierge_prod --table=users --table=trips --table=subscriptions --no-owner --data-only | elixir ./scripts/sanitize_db_dump.exs | psql -d alert_concierge_dev

To grow your users, trips, and subscriptions tables, use the user_multiplier Elixir script. For example, to create 1,000 new users you can run:

mix run ./scripts/user_multiplier.exs --count 1000

This will create 1,000 new users, with their trips and subscriptions, by randomly selecting existing users from the DB and duplicating them.
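
To sanity-check the result you can count rows from iex. The repo module name below (AlertProcessor.Repo) is an assumption; adjust it to the project's actual Ecto repo:

# Assumes the app's Ecto repo is AlertProcessor.Repo -- adjust if it differs.
import Ecto.Query
AlertProcessor.Repo.one(from u in "users", select: count(u.id))
AlertProcessor.Repo.one(from s in "subscriptions", select: count(s.id))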

Measuring Matching Time

There is a helper script you can use to run a server and save parsing and matching times to a CSV file. Simply run it like:

./scripts/timing-server

The timing measurements will be saved as type,time columns to the file parsing_matching_times.csv.

Profiling

fprof

To profile a given expression using Erlang's fprof tool, use the profile.fprof mix task.

Example usage:

First, add 100 users to the DB (or your preferred number of users) with trips and subscriptions.

Then, run the profile.fprof mix task to profile an expression.

env `cat .env` mix profile.fprof -e AlertProcessor.AlertParser.process_alerts --callers --details --sort own > fprof_results.txt

For more information on the fprof mix task: https://hexdocs.pm/mix/Mix.Tasks.Profile.Fprof.html

eflame

Use eflame to generate flame graphs.

Example usage:

First, add 100 users to the DB (or your preferred number of users) with trips and subscriptions.

In the terminal, start an iex session:

iex -S mix

Then, in iex:

:eflame.apply(:normal, "stacks.out", AlertProcessor.AlertParser, :process_alerts, [])

Back in the terminal:

./deps/eflame/stack_to_flame.sh < stacks.out > flame.svg

You might have to increase the timeout in eflame's stop_trace/2. There's an issue and PR (https://github.com/proger/eflame/issues/13) to add support for configuring this, but in the meantime you'll have to do it manually; the relevant line is here: https://github.com/proger/eflame/blob/master/src/eflame.erl#L52

For more information about eflame: https://github.com/proger/eflame