There is a critical need for a way to provide large instances of various code hosts for internal testing purposes. These tools should be Sourcegraph-wide services, and all new features should go through scale testing to ensure they work for all customers, especially strategic customers.
The context mentioned above is quite clear, but how to get there is very blurry.
For example, there are no defined use cases yet. What each team wants to test at scale is based purely on previous observations, with varying confidence levels. What kind of data is relevant enough is totally unclear. What kind of synthetic data represents a customer instance accurately enough that we can be confident a feature tested against it will also work for that customer?
➡️ So we're navigating uncharted waters, and we need to provide tooling to make this happen as early as possible, so each team can use its findings to reshape its roadmap toward strategic readiness.
Repo management & IAM
Have an L/XL instance to play with, connected to GitHub, GitLab, Bitbucket, and Perforce (sorted by priority)
CodeNav
Need to understand how scaling affects performance: have an instance you can scale up to whatever size you want and connect to any code host
Search Core
Find out how many concurrent searches we can support
What can 2,000 engineers do? What do they do to an instance?
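The Search Core questions above can be probed with a small concurrency harness. Here is a minimal sketch in Python; all names are hypothetical, and the stub lambda stands in for whatever issues one real search against the test instance (HTTP call, CLI, ...):

```python
import concurrent.futures
import time

def run_concurrent_searches(search_fn, n_concurrent, n_total):
    """Fire n_total calls to search_fn with n_concurrent workers.

    Returns per-call latencies in seconds, so we can watch p50/p99
    as concurrency grows toward "2,000 engineers" territory.
    """
    def timed(i):
        start = time.monotonic()
        search_fn(i)
        return time.monotonic() - start

    with concurrent.futures.ThreadPoolExecutor(max_workers=n_concurrent) as pool:
        return list(pool.map(timed, range(n_total)))

# Stubbed run: swap the lambda for a real search call to measure the instance.
latencies = run_concurrent_searches(lambda i: time.sleep(0.001),
                                    n_concurrent=50, n_total=200)
p50 = sorted(latencies)[len(latencies) // 2]
print(f"{len(latencies)} searches completed, p50 = {p50 * 1000:.2f} ms")
```

The harness deliberately knows nothing about Sourcegraph: it only measures how latency distributes under concurrency, which is the shape of data this bet needs.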
At this stage, there is no need for any kind of automation, because teams can simply take turns using the scale testing environment.
We are to actively support teammates using the instance and document as we go.
We push for its use. If it's not being used, we seek to understand why.
We build tools that are useful to observe the instance if needed.
We are to pair with them to build tooling to feed data into the instances.
And we maintain the instance, of course.
Boundaries
Single instance only. Or if really needed, we manually duplicate it.
Limited automation for provisioning the instance: populate, snapshot and restore.
Non-goal: replicating customer traffic on the instance.
Non-goal: cost optimization; this is an investment, just don't overspend for no reason.
We maintain, but do not monitor, the scale testing instance; monitoring is up to the teams using it (but if you see something, tell them, eh).
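The "populate, snapshot and restore" automation above can be sketched as thin wrappers around `gcloud`, assuming the instance's data lives on a GCP persistent disk; the disk, snapshot, and zone names here are hypothetical placeholders:

```python
import subprocess

def snapshot_cmd(disk, snapshot_name, zone):
    # `gcloud compute disks snapshot` takes a point-in-time snapshot of a disk.
    return ["gcloud", "compute", "disks", "snapshot", disk,
            f"--snapshot-names={snapshot_name}", f"--zone={zone}"]

def restore_cmd(new_disk, snapshot_name, zone):
    # Restoring means creating a fresh disk from the snapshot; swapping it
    # into the deployment is a separate step (not shown).
    return ["gcloud", "compute", "disks", "create", new_disk,
            f"--source-snapshot={snapshot_name}", f"--zone={zone}"]

def run(cmd, dry_run=True):
    # dry_run prints the command instead of executing it, handy while iterating.
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd, check=True).returncode

run(snapshot_cmd("scaletest-data", "pre-loadtest", "us-central1-a"))
run(restore_cmd("scaletest-data-restored", "pre-loadtest", "us-central1-a"))
```

Keeping the commands as data (lists) rather than shelling out directly makes the dry-run mode trivial and keeps the "limited automation" boundary honest.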
Definition of Done
An engineer familiar with infrastructure and observability can operate and monitor the instance while conducting some tests.
At least two teams got data relevant to strategic readiness out of the scale testing instance.
Instance can be fed with synthetic data:
Users: 10k < X < 20k
Repositories: 100k < X < 250k, with the largest one between 16 GB and 30 GB
Code hosts: GitHub and GitLab at minimum, then Perforce
We know what the limits of our scale testing instance are. How big can we go?
How much does this cost?
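As a purely illustrative sketch (not the real populate tooling), a generator that picks target numbers inside the ranges above could look like this; the size distribution and all names are assumptions:

```python
import random

# Target ranges from the definition of done above.
USER_RANGE = (10_000, 20_000)
REPO_RANGE = (100_000, 250_000)
LARGEST_REPO_MB = (16 * 1024, 30 * 1024)  # 16-30 GB

def synthetic_plan(seed=0):
    """Return a reproducible population plan within the target ranges."""
    rng = random.Random(seed)
    n_users = rng.randint(*USER_RANGE)
    n_repos = rng.randint(*REPO_RANGE)
    # Skew repo sizes: most repos are small (log-normal tail, capped below the
    # band), and one forced "monorepo" lands in the 16-30 GB band so every run
    # exercises the upper bound.
    sizes_mb = [min(15 * 1024, max(1, int(rng.lognormvariate(3, 1.5))))
                for _ in range(n_repos)]
    sizes_mb[0] = rng.randint(*LARGEST_REPO_MB)
    return {"users": n_users, "repos": n_repos, "largest_repo_mb": max(sizes_mb)}

plan = synthetic_plan()
print(plan["users"], plan["repos"], plan["largest_repo_mb"])
```

Seeding the generator matters here: it makes a given "customer-shaped" dataset reproducible, so two teams can test against the same plan.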
Payout
Each participating team has collected feedback on how its domain operates at scale and knows how to create meaningful data. We can reproduce hypothetical use cases that approximate our customers' use cases (repo count and size, user count).
The best-case scenario is that each team has uncovered potential blockers and mapped out how to reproduce customer scenarios. The worst-case scenario is that they have simply uncovered bugs that have affected customers in the past or would have eventually.
We'll have enough data (cost, usage, reliability) to accurately decide if we want to keep focusing on improving the testing capabilities and/or if we want to focus on automating the tooling itself.
Approach
We previously implemented an MVP (see Scale testing), which is currently being tested by Eric Friz and Stephan Hengl. This is our starting point.
➡️ Bet conclusion https://docs.google.com/document/d/1EikVa90v_itxaN5bKSfx59_Dn2q0ElEDhZA6Zp1aiCI/edit#bookmark=id.4thqlb3dusam
See Codehost Scale Testing for more.
This is an essential pillar to reach strategic readiness. See Toward a confidently Strategic-ready code intelligence platform (post-4.0) 2022-08 for more.
See Strategic customer support, state of the world for L / XL definitions.
Tracked issues
@unassigned: Completed
@Piszmog: Completed
@burmudar: Completed
@davejrt: Completed
@jhchabran: Completed (#43349)
@kalanchan: Completed
@kopancek: 1.00d, Completed: 1.00d
@miveronese: 1.00d, Completed: 1.00d
@mucles: Completed
@quinnhare: Completed
@sanderginn: 1.00d, Completed: 1.00d