Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic JSON schema #103

Closed
ceckoslab opened this issue Apr 13, 2020 · 25 comments
Closed

Dynamic JSON schema #103

ceckoslab opened this issue Apr 13, 2020 · 25 comments
Assignees

Comments

@ceckoslab
Copy link
Contributor

We are about to finish the initial ticket for static JSON #81 schema and now is the right moment to continue working on this topic and implement dynamic JSON schema.

I think that we can start looking at this class that returns us all needed Metrics, Filters and Decorators: src/BasicRum/CollaboratorsAggregator.php

Let's outline how we could implement this:

  • Contain the needed logic for the dynamic JSON schema in separate class
  • Write bunch of unit tests in order to confirm that our current implementation works. Also this will prepare us for future refactoring because at some point we will reorganize where some files are stored in the files system.
  • I also expect a lot of discussions here because this task is a huge step and it will challenge the current software architecture of Basic RUM.
@korzol
Copy link
Contributor

korzol commented Apr 15, 2020

Hello @ceckoslab
I am currently working on a class - SchemaOrchestrator. This is kind of test schema builder implementation. Once it is done I hope to see what parts it has in common with DiagramOrchestrator and others.
My idea is that then we can decide if we should incorporate schema builder into diagram builder or leave it as a standalone part

@ceckoslab
Copy link
Contributor Author

Hello @korzol

I like the idea for having a schema builder as a standalone part.

I am curious if during this task we will have software design changes. Are you planning to add new methods and do changes in Metrics, Filters and Layout classes? If yes, then could you share some details. Also did you play with Unit Test for the new class SchemaOrchestrator?

Have a nice day.

@korzol
Copy link
Contributor

korzol commented Apr 15, 2020

I'd like not to change anything currently existing. For now. If there will be any changes required I will do it and let you know, so we can discuss it.
As of Unit Tests - not yet, still did not come this far.

Thanks
You too have a nice day

@korzol
Copy link
Contributor

korzol commented Apr 23, 2020

Hello @ceckoslab
Could you please provide some default JSON. Probably we need some default filter in global section as well as some default filter and metric into segment section. What do you think ?

@ceckoslab
Copy link
Contributor Author

Hello @korzol ,

The defaults will be:

  1. Globa/default filter:
period: {
    type: 'moving',
    start: '130',
    end: 'now'
}
  1. Default filter for 1st segment:
device_type:
 - condition: is
 - search_value: 1
  1. Default metric for 1st segment:
technical_metrics - first_paint
data_flavor - histogram

@korzol
Copy link
Contributor

korzol commented Apr 24, 2020

Hello @ceckoslab
Thank you. However there is one problem with provided Global/default filter

Here is a piece of code we have currently in use:

{
                title: 'Desktop OS distribution',
                global: {
                    presentation: {
                        'render_type': 'distribution'
                    },
                    data_requirements: {
                        period: {
                            type: 'moving',
                            start: '30',
                            end: 'now',
                        },
                        filters: {
                            device_type: {
                                search_value: 2,
                                condition: 'is'
                            }
                        }
                    }
                },

As you can see period and filter can exists simultaneously. Probably it's just wrong copy/paste.

Kindly clarify

P.S. Meantime I use the following for default/global filter

filters: {
    device_type: {
        search_value: 2,
            condition: 'is'
        }
}

@ceckoslab
Copy link
Contributor Author

Hello @korzol

I looked at the PR: https://github.com/basicrum/backoffice/pull/115/files

I suggest 2 things:

  1. We stop writing code for 1-2 days and we try to represent in a diagram how the filters, metrics and data flavors are connected. Please try this link for drawing the diagram: https://app.diagrams.net/#G1ndBLyW4H5qYqaVHx3KfZyks4IUGCJAgF

If the link doesn't work please create your own diagram here https://www.diagrams.net/ and share it later.

After we finish with the diagram we check thing number 2.

  1. We don't use JSON directly in the Backend files but we use multi dimensional arrays.

This:

$json = "
            {
                title: 'Diagram Title',
                global: {
                    presentation: {
                        'render_type': 'distribution'
                    },
                    data_requirements: {
                        period: {
                            type: 'moving',
                            start: '30',
                            end: 'now',
                        }
                    }
                },
...

becomes:

$json = [
                "title" => "Diagram Title",
                "global" => [
                    "presentation" [
                        "render_type" =>  "distribution"
                    ],
                    "data_requirements" [
                        "period" => [
                            "type" => "moving"
                            "start" => "30"
                            "end" => "now"
                        ]
                    ]
                ]
...
]

@korzol
Copy link
Contributor

korzol commented Apr 29, 2020

Hello @ceckoslab
Till this moment I already implemented technical metrics to JSON schema. However this is done in simple way - concatenation. Here is a piece of code from src/BasicRum/Metrics/Collaborator.php

public function getDataFlavor($renderType): string
    {
        $dataFlavor = '';
        if ( $renderType == 'time_series' )
        {
            $dataFlavor = '"data_flavor": {
                                "type": "object",
                                "properties": {
                                    "percentile": {
                                        "enum": [50],
                                        "type": "integer"
                                    }
                                }
                            }';
        }

        elseif ( $renderType == 'plane' )
        {
            $dataFlavor = '"data_flavor": {
                                "type": "object",
                                "properties": {
                                    "histogram": {
                                        "type": "object",
                                        "properties": {
                                            "bucket": {
                                                "enum": [200],
                                                "type": integer
                                            }
                                        }
                                    }
                                }
                            }';
        }
        elseif ( $renderType == 'distribution' )
        {
            $dataFlavor = '"data_flavor": {
                                "type": "object",
                                "properties": {
                                    "count": {
                                        "type": "boolean"
                                    }
                                }
                            }';
        }

        return $dataFlavor;
    }

    public function getDataMetrics($renderType): string
    {
        $segmentMetricsPart = '
                            "technical_metrics": {
                                "type": "object",
                                "properties": {
            ';
        foreach ($this->technicalMetricsClassMap as $key => $class)
        {
            $entry = new $class;
            $segmentMetricsPart .= '
                    "'.$key.'": {
                        "type": "object",
                        "properties": {
                            '.$this->getDataFlavor($renderType).'
                        }
                    }';
        }
        $segmentMetricsPart .= "
                }
            }
            ";

        return $segmentMetricsPart;
    }

Right now want to implement similar thing into BusinessMetrics/Collaborator.php

Also I think it would be a good idea to deploy some php class which can convert php array into JSON schema, if there is any. So for example in getDataFlavor function I can simply give it an array and accept a piece of schema. But still not sure if there already exist any or I have to write one new.

Once I done with business metrics I will check

@korzol
Copy link
Contributor

korzol commented Apr 29, 2020

Hello @ceckoslab
Kindly review the latest commit to PR #115
The work is still in progress. And new changes will be added

@korzol
Copy link
Contributor

korzol commented Apr 29, 2020

Kindly check this attached diagram. Did you mean something like this ?

Filters-Metrics

@ceckoslab
Copy link
Contributor Author

Hello @korzol

I looked at the code and my feeling is that we need to discuss things about the software architecture and which logic stays where. I suggest that we do a peer codding session at some point. If possible this Friday ... perhaps you may take take Monday as a day off.

Here are the things that I have specific comments:

  1. I still see a lot of JSON strings concatenation but it will be better to have arrays where we use array merge in order to glue different parts of the JSON schema. When we want tore return the schema to the browser we can do json_encode() - reference: https://www.php.net/manual/en/function.json-encode.php

It will be better we have arrays but not JSON strings concatenation now because later it will be more harder to refactor and maintain JSON strings.

  1. Did you think about unit tests? If not I can support you by showing you from where to start.

  2. The diagram is getting in shape and it's cool but it looks more like a class diagram. Could you think about a diagram that explains the relation between diagram types and data flavors, because for example time series diagram doesn't support data flavor - histogram.

I urge you to stop implementing new code but to do point 1. and to work on point 3. Feel free to explore the internet about publications for dynamic schemas, you can watch again the Vega Lite video ... feel free to do more research.

@korzol
Copy link
Contributor

korzol commented Apr 29, 2020

Hello @ceckoslab
This is just prove of concept. For me it was easier to start with concatenation and finish it in same style. Once it is done and works, now I can rewrite it using arrays. Actually there is/are part(s) which already use arrays.

  1. Sorry I did not understand what kind of diagram did you mean

@korzol
Copy link
Contributor

korzol commented Apr 29, 2020

Also I am not going to implement any new code. I was going to put existing one in order

@korzol
Copy link
Contributor

korzol commented May 1, 2020

Hello @ceckoslab
Here is a diagram which we worked on.
We still have to find out how/if we can make data_flavor to be dependent on global.presentation.render_type. For example:
if render_type is distribution it can't be histogram. etc..

Also global.data_requirements.period's parameters - type, start, end - still have to be added onto diagram
JSON Schema Diagram

@korzol
Copy link
Contributor

korzol commented May 4, 2020

Hello @ceckoslab
Here is latest updates to diagram. It is not done yet, so any thoughts are welcome
Kindly review

JSON Schema Diagram (2)

@korzol
Copy link
Contributor

korzol commented May 5, 2020

Hello @ceckoslab
Just pushed fresh code update.

Kindly review it and let me know.

Meantime going to work on tests

@korzol
Copy link
Contributor

korzol commented May 5, 2020

Hello @ceckoslab
Just pushed tests as well.

Kindly review and let me know your thoughts

@korzol
Copy link
Contributor

korzol commented May 5, 2020

As of diagram from this post.
I am not sure what to do about:

  1. TechnicalMetrics: DownloadTime and LastBlockingResource
    I think DownloadTime can be percentile and histogram
    And not sure about LastBlockingResource. Would be great if we can discuss it

  2. BusinessMetrics - I think they rely on data from RT plugin, right?
    Here as well would be great if we can discuss it more in details

Kindly clarify

@ceckoslab
Copy link
Contributor Author

Hello @korzol

  1. DownloadTime and LastBlockingResource can be a Histogram and Percentile.

  2. The business metrics depend on data from the RT plugin but not directly. For example the Bounce Rate is not calculate because bounce rate metric is send on a beacon but because we check how many pages a client has visited. At one moment we will calculate how long a visitor stays on a page and this again can be done with the help of RT plugin metrics. In conclusion I would say that the RT plugin is used for providing visitor session information.

@korzol
Copy link
Contributor

korzol commented May 11, 2020

Hello @ceckoslab
Just pushed latest updates.
Here is a problem:
I am trying to figure out solution for this thing. As you may noticed I use "definitions" schema feature for Json segments:

segments: {
        1: {
            presentation: {
                name: '',
                color: '#ff6023',
                type: 'bar'
            },
            data_requirements: {
                filters: {
                    device_type: {
                        search_value: 1,
                        condition: 'is'
                    }
                },
                technical_metrics: {
                    first_paint: {
                        data_flavor: {
                            histogram: {
                                bucket: '200'
                            }
                        }
                    }
                }
            }
        }
    }

In the example above I have only one segment - 1. However it is possible to add more segments 2, 3, 4 etc.. accordingly.

And in schema I use $ref feature like this:

"segments": {
        "type": "object",
        "properties": {
            "1": {
                "$ref": "#/definitions/segment"
            }
        }
}

Now, I am not sure how to make schema works for segments 2, 3, 4 etc..
I see 2 possible solutions:

  1. hardcode predefined amount of segments. Lets say 4 segments should be enough
  2. Currently "segments" section is object. If its possible to convert it to array then we can make schema more compact by using the following:
    https://json-schema.org/learn/miscellaneous-examples.html
    See "Arrays of things" section

Let me know, what you think

@korzol
Copy link
Contributor

korzol commented May 13, 2020

Here is latest changes to diagram

JSON Schema Diagram (5)

@ceckoslab
Copy link
Contributor Author

Hello @korzol

I am answering on:

I see 2 possible solutions:

1. hardcode predefined amount of segments. Lets say 4 segments should be enough
2. Currently "segments" section is object. If its possible to convert it to array then we can make schema more compact by using the following:
https://json-schema.org/learn/miscellaneous-examples.html
See "Arrays of things" section
Let me know, what you think

Could you explore option 2? I think that it will be good because the number of segments will vary.

@korzol
Copy link
Contributor

korzol commented May 15, 2020

Hello @ceckoslab
With latest update I've pushed implementation of second option - schema segments object to array.

Ran tests locally and it went fine. Travis as well reported tests went fine

Kindly review and let me know

@korzol
Copy link
Contributor

korzol commented May 15, 2020

I also tested if we can use array instead of object for segments on dashboards diagrams. And they worked well

@korzol
Copy link
Contributor

korzol commented May 18, 2020

Here is latest changes to diagram
JSON Schema Diagram (6)

ceckoslab added a commit to korzol/backoffice that referenced this issue May 22, 2020
ceckoslab added a commit that referenced this issue May 22, 2020
* Added info how to access phpmyadmin locally

* WIP. Basement for dynamic schema implementation

* Bootstrap version bump. Current version doesn't support modal-xl modal dialog class

* Bootstrap version bump. Current version doesn't support modal-xl modal dialog class

* Widget creation modal window redesign

* WIP. Outstanding changes

* Latest updates to dynamic json schema generator

* Concatenation to array

* Move code out of collaborators

* phpunit tests for dynamic schema

* Remove outstanding unneeded function

* Remove DiagramBaseJson.php in favor of default_json.js

* Fix: number of days in schema

* Filters Schema plugins under src/BasicRum/DiagramSchema/

* Preselected diagram type in JSON editor

* Dynamic diagram tests. Switch from json to arrays comparison

* Fix: AbstractFilters

* Add: oneOf to technical and business metrics

* Update dynamic diagram tests reflecting latest diagram changes

* Fix: echo schema

* Migrate segments from object to array

* Migrate segments from object to array #2

* New getDataFlavor function and general cleanup

* Clean unneeded function

* Following the best practices to output Json from controller

* Now technical metrics depend on render_type

* Changes in tests reflecting latest changes in diagram

* Slightly widget controller cleanup

* Revert occasionally submitted file

* Remove outdated bootstrap

Co-authored-by: ceckoslab <ceckoslab@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants