Daemon: identify based lookup #2051

Draft
wants to merge 14 commits into
base: master
Conversation

kumavis
Member

@kumavis kumavis commented Feb 10, 2024

identify takes a petName or namePath and returns a formulaIdentifier

  • lookup can mostly* be built on top of identify.
  • identify can be shortcut for certain formula types, reducing reification to a smaller subgraph

by taking advantage of these two facts, you can potentially reify a smaller subgraph in order to facilitate a lookup.
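A minimal sketch of that idea, using a toy in-memory model rather than the daemon's real internals. Every name and identifier format here (`petStores`, `storeIdFor`, `provideValue`, `host:1`) is hypothetical. The assumption is that a host's names live in a pet store sharing its formula number, so `identify` can step through a host without reifying it, and `lookup` only reifies the final formula.

```javascript
// Toy model: each pet store maps pet names to formula identifiers.
// All names and identifier formats are illustrative, not the daemon's real ones.
const petStores = new Map([
  ['pet-store:root', new Map([['a', 'host:1']])],
  ['pet-store:1', new Map([['b', 'host:2']])],
  ['pet-store:2', new Map([['answer', 'eval:3']])],
]);
const values = new Map([['eval:3', 42]]);
const reified = []; // records every identifier that had to become a value

// Shortcut for hosts (assumption): the host's pet store shares its number,
// so the walk can continue without reifying the host itself.
const storeIdFor = formulaId => {
  const [type, number] = formulaId.split(':');
  return type === 'host' ? `pet-store:${number}` : undefined;
};

const provideValue = async formulaId => {
  reified.push(formulaId);
  return formulaId.startsWith('pet-store:')
    ? petStores.get(formulaId)
    : values.get(formulaId);
};

// identify: name path -> formula identifier, or undefined if any step fails.
const identifyFrom = async (storeId, namePath) => {
  let formulaId;
  let currentStoreId = storeId;
  for (const name of namePath) {
    if (currentStoreId === undefined) return undefined; // cannot descend further
    const store = await provideValue(currentStoreId);
    formulaId = store.get(name);
    if (formulaId === undefined) return undefined;
    currentStoreId = storeIdFor(formulaId);
  }
  return formulaId;
};

// lookup: identify first, then reify only the final formula.
const lookup = async namePath => {
  const formulaId = await identifyFrom('pet-store:root', namePath);
  return formulaId === undefined ? undefined : provideValue(formulaId);
};
```

In this toy model, `lookup(['a', 'b', 'answer'])` touches three pet stores and one eval value, and no host ever appears in `reified`.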

here is an example setup with a long lookup path

❯ endo purge --force

❯ endo mkhost a                        
Object [Alleged: EndoHost] {}

❯ endo mkhost -a a b          
Object [Alleged: EndoHost] {}

❯ endo mkhost -a a -a b c
Object [Alleged: EndoHost] {}

❯ endo eval -a a -a b -a c 42 -n d     
42

❯ endo restart

❯ endo show a.b.c.d               
42
❯ endo log
Making pet-store
Making pet-store-id512:48ef8b5fbb0ac5c26271b607c96d431b6c8ad842a336c647a47dbb1f68aff9d3ae847ac0e77b422a22920a86f9dd297c20fb1e1aba96f1ee835090e7bc8cecbf
Making pet-store-id512:f44334210cfeb3b94a96e2e0a20397fb3c0133145dbb40f9ddbf78deca5fa787e7992eb12963cb5f783a294308bfb4f76de09656bbd862671114646058b3b530
Making pet-store-id512:ca64a6b2c44ba59cdff4de7384bb48e628265d94761fa3bb8e098f0e1f615e93207279f56f64bd0b9014c85649fbc3d24bf79c439def93a1122cbb98445757ee
Making eval-id512:54b78e2a23c9f0800b2770111d4fee85a8ae47cab6150f310f8bb688dfb04aee107c7e74e31f114844563a7e479d0f52338fe1d459f178390a64ae8febf93c4d
Making worker-id512:ca64a6b2c44ba59cdff4de7384bb48e628265d94761fa3bb8e098f0e1f615e93207279f56f64bd0b9014c85649fbc3d24bf79c439def93a1122cbb98445757ee
Endo worker started PID 670850 unique identifier ca64a6b2c44ba59cdff4de7384bb48e628265d94761fa3bb8e098f0e1f615e93207279f56f64bd0b9014c85649fbc3d24bf79c439def93a1122cbb98445757ee

From this log you can see that in order to perform lookup of a.b.c.d it only had to reify:

  • 0 hosts
  • 4 petstores
  • 1 eval
  • 1 worker

without the host short-cutting optimization, we get the following log:

❯ endo log         
Making host
Making pet-store
Making worker-id512:0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Endo worker started PID 671854 unique identifier 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Making host-id512:48ef8b5fbb0ac5c26271b607c96d431b6c8ad842a336c647a47dbb1f68aff9d3ae847ac0e77b422a22920a86f9dd297c20fb1e1aba96f1ee835090e7bc8cecbf
Making pet-store-id512:48ef8b5fbb0ac5c26271b607c96d431b6c8ad842a336c647a47dbb1f68aff9d3ae847ac0e77b422a22920a86f9dd297c20fb1e1aba96f1ee835090e7bc8cecbf
Making worker-id512:48ef8b5fbb0ac5c26271b607c96d431b6c8ad842a336c647a47dbb1f68aff9d3ae847ac0e77b422a22920a86f9dd297c20fb1e1aba96f1ee835090e7bc8cecbf
Endo worker started PID 671863 unique identifier 48ef8b5fbb0ac5c26271b607c96d431b6c8ad842a336c647a47dbb1f68aff9d3ae847ac0e77b422a22920a86f9dd297c20fb1e1aba96f1ee835090e7bc8cecbf
Making host-id512:f44334210cfeb3b94a96e2e0a20397fb3c0133145dbb40f9ddbf78deca5fa787e7992eb12963cb5f783a294308bfb4f76de09656bbd862671114646058b3b530
Making pet-store-id512:f44334210cfeb3b94a96e2e0a20397fb3c0133145dbb40f9ddbf78deca5fa787e7992eb12963cb5f783a294308bfb4f76de09656bbd862671114646058b3b530
Making worker-id512:f44334210cfeb3b94a96e2e0a20397fb3c0133145dbb40f9ddbf78deca5fa787e7992eb12963cb5f783a294308bfb4f76de09656bbd862671114646058b3b530
Endo worker started PID 671872 unique identifier f44334210cfeb3b94a96e2e0a20397fb3c0133145dbb40f9ddbf78deca5fa787e7992eb12963cb5f783a294308bfb4f76de09656bbd862671114646058b3b530
Making host-id512:ca64a6b2c44ba59cdff4de7384bb48e628265d94761fa3bb8e098f0e1f615e93207279f56f64bd0b9014c85649fbc3d24bf79c439def93a1122cbb98445757ee
Making pet-store-id512:ca64a6b2c44ba59cdff4de7384bb48e628265d94761fa3bb8e098f0e1f615e93207279f56f64bd0b9014c85649fbc3d24bf79c439def93a1122cbb98445757ee
Making worker-id512:ca64a6b2c44ba59cdff4de7384bb48e628265d94761fa3bb8e098f0e1f615e93207279f56f64bd0b9014c85649fbc3d24bf79c439def93a1122cbb98445757ee
Endo worker started PID 671876 unique identifier ca64a6b2c44ba59cdff4de7384bb48e628265d94761fa3bb8e098f0e1f615e93207279f56f64bd0b9014c85649fbc3d24bf79c439def93a1122cbb98445757ee
Making eval-id512:54b78e2a23c9f0800b2770111d4fee85a8ae47cab6150f310f8bb688dfb04aee107c7e74e31f114844563a7e479d0f52338fe1d459f178390a64ae8febf93c4d

From this log you can see that in order to perform lookup of a.b.c.d it had to reify:

  • 4 hosts
  • 4 petstores
  • 1 eval
  • 4 workers

layering:

  • identifyLocal in petstore
  • identifyLocal in mail
  • identifyFrom in daemon
  • identify in petstore
  • identify in mail
  • identify in host / guest
  • identify in inspector (with double path part consumption)
  • identify cli command
  • lookup via identify

@rekmarks
Contributor

rekmarks commented Feb 11, 2024

There are pet name path utilities in cli/src/pet-name.js and daemon/src/pet-name.js that look like they could be used in a variety of places in this PR.

@kumavis
Member Author

kumavis commented Feb 11, 2024

while this approach can yield some potential optimization, the shortcut needs to be implemented per formula type with hosts/guests being the only candidates at present.

It's not clear that this optimization is worth the limitation that lookup values must have a formulaId. This limitation might be avoidable by replacing identify with a resolve: (namePath: string[]) => Promise<{ formulaId: string|undefined, remainingPath: string }> that identifies as far as it can, and then lookup is run on the remainder.
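One possible reading of that `resolve` idea, sketched over a toy in-memory graph. The data, the `names` and `valuesById` maps, and the choice to return `remainingPath` as an array of path parts (the signature above writes `string`) are all assumptions made for illustration.

```javascript
// Hypothetical toy graph: names whose targets have a formula identifier,
// plus plain values reachable only by ordinary property lookup.
const names = new Map([
  ['root', new Map([['config', 'value:1']])],
]);
const valuesById = new Map([
  ['value:1', { server: { port: 8080 } }],
]);

// resolve: consume as many leading path parts as can be identified,
// and report what is left over.
const resolve = async namePath => {
  let formulaId;
  let directory = names.get('root');
  let consumed = 0;
  for (const name of namePath) {
    const next = directory && directory.get(name);
    if (next === undefined) break; // nothing more can be identified
    formulaId = next;
    consumed += 1;
    directory = names.get(next); // undefined when the target is not a name hub
  }
  return { formulaId, remainingPath: namePath.slice(consumed) };
};

// lookup: reify the identified formula, then walk the remainder on the value.
const lookup = async namePath => {
  const { formulaId, remainingPath } = await resolve(namePath);
  if (formulaId === undefined) return undefined;
  let value = valuesById.get(formulaId);
  for (const part of remainingPath) {
    if (value === undefined || value === null) return undefined;
    value = value[part];
  }
  return value;
};
```

Here `config` has an identifier but `server` and `port` do not, so `resolve` stops after one part and `lookup` finishes the path on the reified value.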

other design considerations:
this implementation does a single "fully delegated identify with recursion". You could replace this approach with an "iterated identify". fully delegated may have better performance when a long path section is across a network boundary. iterated could have better performance when paths cross many network boundaries (analogy; iterated: series of HTTP redirects, delegated: establishing a tor circuit).
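To make the two strategies concrete, here is a rough sketch with hypothetical name hubs. `makeHub`, `identifyStep`, and the `clientCalls` counter are invented for illustration and do not correspond to anything in the PR. Each hub call stands in for a request that might cross a network boundary.

```javascript
// Hypothetical name hubs. Each entry has an identifier and, when the target
// is itself a hub, a reference to continue from.
const makeHub = entries => ({
  // One step: resolve a single name and tell the caller where to go next.
  identifyStep: async name => entries.get(name),
  // Fully delegated: the hub recurses on the caller's behalf.
  identify: async ([head, ...rest]) => {
    const entry = entries.get(head);
    if (entry === undefined) return undefined;
    if (rest.length === 0) return entry.id;
    return entry.hub ? entry.hub.identify(rest) : undefined;
  },
});

const c = makeHub(new Map([['d', { id: 'eval:d' }]]));
const b = makeHub(new Map([['c', { id: 'host:c', hub: c }]]));
const a = makeHub(new Map([['b', { id: 'host:b', hub: b }]]));

// Iterated: the client drives every step, like following HTTP redirects.
const identifyIterated = async (rootHub, namePath) => {
  let hub = rootHub;
  let id;
  let clientCalls = 0;
  for (const name of namePath) {
    if (hub === undefined) return { id: undefined, clientCalls };
    clientCalls += 1;
    const entry = await hub.identifyStep(name);
    if (entry === undefined) return { id: undefined, clientCalls };
    id = entry.id;
    hub = entry.hub;
  }
  return { id, clientCalls };
};

// Delegated: the client makes one call and the hubs forward the rest.
const identifyDelegated = async (rootHub, namePath) => ({
  id: await rootHub.identify(namePath),
  clientCalls: 1,
});
```

Both return the same identifier for `['b', 'c', 'd']`. The difference is who pays for each hop: the iterated client issues one call per path part, while the delegated client issues one call and relies on each hub to reach the next.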

one interesting finding of identify based lookup vs direct lookup is the ability to redirect when crossing multiple network boundaries. If an identifier includes a location (eg network address) for the resource, you can establish a direct connection for accessing the resource, whereas direct lookup would have to be proxied across each network boundary. though, this should also supposedly be fixed by captp level handoffs.

delegated and resolve both allow multiple path parts to be consumed in one step where possible. this is seen in this implementation for inspector.identify for eval record endowments. For the example path INFO.someEval.endowments.someEndowment, the inspector will consume three parts (petName, propertyName, endowmentName)
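A small sketch of multi-part consumption, assuming a made-up formula record shape. `formulasByPetName`, the `consumed` count, and the record fields are illustrative, not the inspector's actual data model.

```javascript
// Hypothetical formula records keyed by pet name; shapes are illustrative only.
const formulasByPetName = new Map([
  ['someEval', {
    type: 'eval',
    worker: 'worker:1',
    endowments: new Map([['someEndowment', 'value:7']]),
  }],
]);

// Returns the identifier found plus how many path parts were consumed,
// so a caller can continue with whatever remains of the path.
const inspectorIdentify = namePath => {
  const [petName, propertyName, endowmentName] = namePath;
  const formula = formulasByPetName.get(petName);
  if (formula === undefined || formula.type !== 'eval') return undefined;
  if (propertyName === 'worker') {
    // petName + propertyName
    return { formulaId: formula.worker, consumed: 2 };
  }
  if (propertyName === 'endowments' && endowmentName !== undefined) {
    // petName + propertyName + endowmentName
    const formulaId = formula.endowments.get(endowmentName);
    return formulaId === undefined ? undefined : { formulaId, consumed: 3 };
  }
  return undefined;
};
```

With this shape, `someEval.endowments.someEndowment` is identified in a single step that consumes three parts, while `someEval.worker` consumes two.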

imagine a DNS naming hub dns that performs DNS queries and responds with entries. lookup('dns.me.kumavis.dnsToy') would return entries on dnsToy.kumavis.me. The dns naming hub could consume all parts of the path in one step if the identify/resolve/lookup machinery allowed it. You could also implement this for a stepwise resolve but it might lead to a more awkward interface.
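A sketch of such a hub, assuming the hub receives the path parts after its own name. `makeDnsHub` and the injected `resolver` are hypothetical, and the resolver is faked with canned records so the example runs offline.

```javascript
// Hypothetical DNS naming hub. `resolver` stands in for a real DNS query.
const makeDnsHub = resolver => ({
  // Consumes the whole remaining path in one step.
  lookup: async namePath => {
    // Pet name paths run from general to specific; DNS names run the other way.
    const hostname = [...namePath].reverse().join('.');
    return resolver(hostname);
  },
});

// Fake resolver with made-up records, for illustration only.
const records = new Map([['dnsToy.kumavis.me', ['TXT "example"']]]);
const dns = makeDnsHub(async hostname => records.get(hostname) || []);
```

Calling `dns.lookup(['me', 'kumavis', 'dnsToy'])` queries `dnsToy.kumavis.me` in a single step, which is the behavior a stepwise interface would have to emulate by threading partial state between steps.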

in conclusion, it was very interesting to learn the complexity of the design space for something as seemingly simple as path lookups 🤠. I have no idea how common long complicated paths will be, but the insights here may inform other aspects of the system design.

@kumavis kumavis changed the title Daemon: identify Daemon: identify based lookup Feb 11, 2024
@rekmarks rekmarks changed the base branch from endo to master February 15, 2024 23:13
Comment on lines +6 to +14
export const identify = async ({ cancel, cancelled, sockPath, namePath }) =>
  withEndoBootstrap({ os, process }, async ({ bootstrap }) => {
    const defaultHostFormulaIdentifier = 'host';
    const formulaId = await E(bootstrap).identifyFrom(
      defaultHostFormulaIdentifier,
      namePath,
    );
    console.log(formulaId);
  });
Member

This is very close to what I imagined and a step in the right direction.

High level:

  • We need an API and a command that can reveal the formula type. I had envisioned that endo identify would do that, but perhaps that should have a name like describe. I think endo describe should also be able to print the nonce/swissnum/identifier like endo describe --id <name>
  • Whatever API and command reveals the nonce/swissnum/identifier for a pet name path should probably be producing a URL so maybe that should be called endo locate <pet-name-path> and produce the URL by default, including the connection hints for the listening networks. Maybe endo locate --json <path> gives you the JSON representation. Maybe endo locate --id <path> trims that back to just the nonce.
  • Maybe locate and describe are the same command or two commands with the same behavior but different defaults.

Nits:

  • Let’s get rid of cancel, cancelled, and sockPath here and where this refactor-cruft was copied from.
  • host is not a formula identifier anymore. We should always start looking stuff up at E(bootstrap).host().
  • If we move some functions from the endo bootstrap object to host objects, maybe we can just present the primary host facet as the bootstrap object.
  • Maybe we should use process.stdout.write(`${id}\n`) instead of console.log just to break the bad habit of confusing the debugger for an output printer. ymmv.

export const show = async ({ cancel, cancelled, sockPath, namePath }) =>
  withEndoBootstrap({ os, process }, async ({ bootstrap }) => {
    const defaultHostFormulaIdentifier = 'host';
    const pet = await E(bootstrap).lookupFrom(
Member

Ah, I see. In my head, lookupFrom is equivalent to E(E(host).locate(identifier)).lookup(path) but consolidated. The consolidation might be important since we might want to locate and lookup locally in a single event.

I think we should move the method to Host in order to stay one step away from folding the Endo Bootstrap object into Host objects. They are equally powerful since you can’t get one without the other and there’s little reason to keep them separate.

I’m not attached to the locate(nonce) ~=> identifier function name, and I think I at least temporarily need the name of this method to be explicitly lookupFromFormulaNumber, with goal of reducing that to lookupFromIdentifier when we no longer need to distinguish formula identifier with type from formula number.

const petStore = await provideValueForFormulaIdentifier(
  storeFormulaIdentifier,
);
const result = await E(petStore).identify(namePath);
Member

We should definitely rename petStore.locate to petStore.identify as this suggests. It should be synchronous, though.

storeFormulaIdentifier = `pet-store-id512:${formulaNumber}`;
}
// eslint-disable-next-line no-use-before-define
const petStore = await provideValueForFormulaIdentifier(
Member

I suppose we can’t make this synchronous!

const origin = await provideValueForFormulaIdentifier(
  originFormulaIdentifier,
);
return E(origin).identify(namePath);
Member

Given this is async tail recursive and has to at least reïfy pet stores (an unavoidably async operation), I withdraw my hope that identify can even be locally synchronous or transactional.

Given that it can’t be synchronous, there’s no reason to bundle up locate(identifier)~.lookup(path). They can be peanut and butter: two separate methods that are better together.

];

const allowedInspectorFormulaIdentificationsByType = {
eval: ['endowments', 'worker'],
Member

I’m not sure that inspectors generalize this well. endowments is an array of identifiers and we want the corresponding values, whereas worker is a single identifier. This does not mention source which is just a string.

In short, I think we should have a boring identify function for each formula type rather than a generalization because it will be more legible and easier to debug.

Comment on lines +775 to +776
const { type: formulaType, number: formulaNumber } =
  parseFormulaIdentifier(entryFormulaIdentifier);
Member

By the time we get here, we may have removed the type from the formula identifier, but it can be trivially recovered from the controller object for the number.

Member

When we have a formula for every formula identifier (just the number), it might make sense to just generally store that on the controller too, so we benefit from the memo and avoid redundant reads from the formula store.
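A rough sketch of that memoization, assuming a controller table keyed by formula number. `readFormula`, `provideController`, and `formulaTypeFor` are hypothetical names, and the formula store is faked with a `Map` that counts reads.

```javascript
// Hypothetical stand-in for the on-disk formula store; counts reads.
let reads = 0;
const formulaStore = new Map([['abc123', { type: 'eval', source: '42' }]]);
const readFormula = async formulaNumber => {
  reads += 1;
  return formulaStore.get(formulaNumber);
};

// One controller per formula number. The formula promise is stored on the
// controller so later callers reuse it instead of reading the store again.
const controllers = new Map();
const provideController = formulaNumber => {
  let controller = controllers.get(formulaNumber);
  if (controller === undefined) {
    controller = { formula: readFormula(formulaNumber) };
    controllers.set(formulaNumber, controller);
  }
  return controller;
};

// Recover the formula type from the controller, given only the number.
const formulaTypeFor = async formulaNumber => {
  const formula = await provideController(formulaNumber).formula;
  return formula && formula.type;
};
```

Storing the promise rather than the resolved formula means concurrent callers also share a single read.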

@@ -170,7 +171,7 @@ export const makeHostMaker = ({
* @param {string | 'NONE' | 'SELF' | 'ENDO'} partyName
*/
const providePowersFormulaIdentifier = async partyName => {
- let guestFormulaIdentifier = lookupFormulaIdentifierForName(partyName);
+ let guestFormulaIdentifier = identifyLocal(partyName);
Member

I like abbreviating lookupFormulaIdentifierForName to identifyLocal generally. That could be one easy-to-review mechanical refactor on its own.

Since the difference between identify and identifyLocal is more like identifyPetNamePath vs identifyPetName respectively, I could buy identify (implied pet name path since it’s public API) and identifyName instead.

Comment on lines +47 to +58
/** @type {import('./types.js').IdentifyFn} */
const identify = async maybeNamePath => {
  const namePath = Array.isArray(maybeNamePath)
    ? maybeNamePath
    : [maybeNamePath];
  const [headName, ...namePathRest] = namePath;
  const formulaIdentifier = identifyLocal(headName);
  if (formulaIdentifier === undefined) {
    return undefined;
  } else {
    return identifyFrom(formulaIdentifier, namePathRest);
  }
Member

No notes.

Comment on lines +84 to +95
/** @type {import('./types.js').IdentifyFn} */
const identify = async maybeNamePath => {
  const namePath = Array.isArray(maybeNamePath)
    ? maybeNamePath
    : [maybeNamePath];
  const [headName, ...namePathRest] = namePath;
  const formulaIdentifier = identifyLocal(headName);
  if (formulaIdentifier === undefined) {
    return undefined;
  } else {
    return identifyFrom(formulaIdentifier, namePathRest);
  }
Member

I don’t think this should exist at this level of abstraction but we should factor out an intermediate layer for name hubs between mail and pet store whenever we introduce directories.

3 participants