Merge pull request #10 from OAI/develop
Implement merge process as described in issue #1
SensibleWood committed Feb 25, 2022
2 parents 15ec8aa + 401ee5b commit 5508254
Showing 23 changed files with 12,370 additions and 6,014 deletions.
38 changes: 32 additions & 6 deletions README.md
@@ -2,7 +2,7 @@

This project is provided by the OpenAPI Initiative as a means to centralize ecosystem information on OpenAPI-related tooling. It leverages open source projects that have gone before to provide a consolidated list of tooling.

- This project Kanban board for Tooling can be found here: https://github.com/OAI/Projects/projects/4
+ The project Kanban board for Tooling can be found here: https://github.com/OAI/Projects/projects/4

## Roll Call

@@ -27,17 +27,43 @@ Each is expanded upon in the sections below.

The tooling list is built in largely the same format as the projects that blazed a trail in tooling before it (which this project of course takes full advantage of).

- In order to bring this together in a sensible way a Gulp-based process has been implemented. Gulp was chosen given the relative ease with which functions can be implemented to massage the data stream and to ensure the build is not closely-coupled to a CI tool.
+ In order to bring this together in a sensible way a Gulp-based process has been implemented. Gulp was chosen given the relative ease with which functions can be implemented to massage the data stream and to ensure the build is not closely-coupled to a (commercial) CI tool. The functions themselves are abstracted away from Gulp to enable the build to "lift-and-shift" to a new build tool as required.
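That decoupling can be sketched as follows; the function and file names here are illustrative, not the project's actual code. Each build step is a plain string-to-string function that Gulp merely wires together, so moving to another build tool means rewriting only the wiring:

```javascript
// Illustrative sketch: a build step as a plain function over strings,
// with no dependency on Gulp itself.
const sortToolsByName = (rawTools) => {
  const tools = JSON.parse(rawTools);
  tools.sort((a, b) => a.name.localeCompare(b.name));
  return JSON.stringify(tools);
};

// Gulp-specific wiring stays separate and is trivially replaceable:
// const { src, dest } = require('gulp');
// const transform = require('gulp-transform');
// const build = () => src('build/tools.json')
//   .pipe(transform('utf8', sortToolsByName))
//   .pipe(dest('docs/'));
```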

- Currently only the initial build is implemented. The steps are as follows:
### Full Build

- * Retrieve each tooling source.
The full build takes the following approach:

* Retrieve each tooling source, including the existing list at [docs/tools.yaml](docs/tools.yaml).
* Combine source data based on repository name.
* Normalise property names across sources using simple statistics (Sørensen–Dice, Damerau–Levenshtein distance).
* Get repository metadata from GitHub.
- * Write to [docs/tools.yaml](docs/tools.yaml)
+ * Write to [docs/tools.yaml](docs/tools.yaml).
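The combine step above can be sketched as grouping entries on a lowercased repository URL (a simplified version of the mergeSources logic in this commit's gulpfile):

```javascript
// Group tool entries by repository URL so duplicates across sources
// end up in the same bucket for later property merging.
const mergeByRepository = (sources) => sources.reduce((output, source) => {
  const uri = (source.github || source.link || source.homepage).toLowerCase();
  return Object.assign(output, { [uri]: (output[uri] || []).concat(source) });
}, {});
```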

Currently this build is not scheduled; adding a schedule is [in the backlog](https://github.com/OAI/Tooling/issues/9).

To run the full build:

```bash
yarn install
gulp full
```
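The property normalisation step relies on string similarity. As a rough illustration of how the Sørensen–Dice coefficient works (this is not the build's actual implementation), it compares the character bigrams of two property names:

```javascript
// Sørensen–Dice similarity over character bigrams: 2·|A∩B| / (|A| + |B|).
const bigrams = (s) => {
  const grams = [];
  for (let i = 0; i < s.length - 1; i += 1) grams.push(s.slice(i, i + 2));
  return grams;
};

const diceCoefficient = (a, b) => {
  const aGrams = bigrams(a.toLowerCase());
  const bGrams = bigrams(b.toLowerCase());
  let matches = 0;
  aGrams.forEach((gram) => {
    const idx = bGrams.indexOf(gram);
    if (idx >= 0) {
      matches += 1;
      bGrams.splice(idx, 1); // Consume each bigram at most once
    }
  });
  // bGrams.length + matches restores the original |B|
  return (2 * matches) / (aGrams.length + bGrams.length + matches);
};
```

A score near 1 suggests two sources are describing the same property under slightly different names.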

### Metadata Update

The goal of the metadata update is to provide consistent repository metadata without sourcing new tooling:

* Read the existing list at [docs/tools.yaml](docs/tools.yaml).
* Get repository metadata from GitHub.
* Write to [docs/tools.yaml](docs/tools.yaml).
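The metadata fetch can be sketched against GitHub's public REST API; the helper names below are illustrative (the build's real implementation lives in getRepositoryMetadata), and the credentials come from the same GITHUB_USER and GITHUB_TOKEN environment variables the gulpfile reads:

```javascript
// Extract "owner/repo" from a GitHub repository URL (pure helper).
const parseGithubRepository = (url) => {
  const match = url.match(/github\.com\/([^/]+)\/([^/#?]+)/);
  return match ? `${match[1]}/${match[2].replace(/\.git$/, '')}` : null;
};

// Hypothetical metadata fetch; assumes Node 18+ for the global fetch.
const getMetadata = async (repositoryUrl) => {
  const repo = parseGithubRepository(repositoryUrl);
  if (!repo) return {};
  const response = await fetch(`https://api.github.com/repos/${repo}`, {
    headers: {
      Authorization: `Basic ${Buffer.from(
        `${process.env.GITHUB_USER}:${process.env.GITHUB_TOKEN}`,
      ).toString('base64')}`,
    },
  });
  const { stargazers_count: stars, archived } = await response.json();
  return { stars, archived };
};
```

Only a couple of response fields are shown here; the real build records more repository properties.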

Currently this build is not scheduled; adding a schedule is [in the backlog](https://github.com/OAI/Tooling/issues/9).

To run the metadata build:

- This process will be amended to include a merge process to capture updates to sources and merge into the existing dataset.
```bash
yarn install # If you haven't done this already
gulp metadata
```

### Website

17,003 changes: 11,411 additions & 5,592 deletions docs/tools.yaml


167 changes: 16 additions & 151 deletions gulpfile.js/index.js
@@ -2,163 +2,28 @@
const { src, dest } = require('gulp');
const transform = require('gulp-transform');
const rename = require('gulp-rename');
const log = require('fancy-log');
const YAML = require('js-yaml');

- const { getGithubRepositoryMetadata, normalisePropertyNames, normaliseSplitters } = require('./lib');
+ const {
+   getRepositoryMetadata,
+   readSourceData,
+   mergeSources,
+   normaliseSources,
+ } = require('../lib');

/**
* Read all sources of data and return array of objects containing results
*
* @param {string} rawConfig The raw configuration file found in this directory
* @returns {Object[]} Array of objects returned by each processor
*/
const readSourceData = async (rawConfig) => {
log('readSourceData');

const config = JSON.parse(rawConfig);

const results = await config
.reduce(async (output, source) => {
const update = await output;
log(source.title, 'Reading source data...');

// Yes, this is an anti-pattern and opinionated approach...
// but it provides a nice level of flexibility in the build mechanism
// so let's live with it for now...

// eslint-disable-next-line import/no-dynamic-require, global-require
const processor = require(`${__dirname}/${source.processor}`);
const data = await processor(source.title, source.url);

return update.concat(data);
}, []);

// A string needs to be returned by the async operation
return JSON.stringify(results);
};

/**
* Merge all sources, combining entries where duplicates are found
*
* @param {string} rawSources JSON-encoded array of source objects
* @returns {string} Merged sources
*/
const mergeSources = async (rawSources) => {
log('mergeSources');
const sources = JSON.parse(rawSources);

// Get properties across all sources
const sourceProperties = sources
.reduce((output, source) => Object.assign(
output,
Object.keys(source)
.reduce((thisOutput, key) => Object
.assign(thisOutput, { [key]: (output[key] || 0) + 1 }), {}),
{},
), {});

// This of course removes some of the flexibility we get from the processor approach
// Need to devise a way to discover the uri rather than using hard-coded values
const mergedSources = sources
.reduce((output, source) => {
const uri = (source.github || source.link || source.homepage).toLowerCase();

return Object.assign(output, { [uri]: (output[uri] || []).concat(source) });
}, {});

return JSON.stringify({ sourceProperties, mergedSources });
};

/**
* Normalise and classify the data consistently across all sources
*
* @param {Object} rawSources Source data keyed on URL
* @returns {string} YAML-encoded string of tools with merged properties
*/
const normaliseSources = async (rawSources) => {
log('normaliseSources');

const keyMappings = {
github: 'repository',
gitlab: 'repository',
bitbucket: 'repository',
description: 'source_description',
};
const { sourceProperties, mergedSources } = JSON.parse(rawSources);
const normalisedProperties = normalisePropertyNames(sourceProperties);
const functionMappings = {
language: normaliseSplitters,
};

// There are easier ways of doing this by using multiple reassignments
// but this is better for readability/general understanding of what is going on
const normalisedSources = Object.values(mergedSources)
.reduce((output, tool) => output.concat(tool
.reduce((toolOutput, source) => Object.assign(
toolOutput,
Object.entries(source)
.reduce((sourceOutput, [key, value]) => {
const targetPropertyName = normalisedProperties[key];

return Object.assign(
sourceOutput,
{
[targetPropertyName]: toolOutput[targetPropertyName]
? [].concat(toolOutput[targetPropertyName], value) : value,
},
);
}, {}),
), {})), [])
.map((tool) => Object.entries(tool)
.reduce((output, [key, value]) => {
// If source data is an array then use Set to dedup the values
const metadataValue = Array.isArray(value) ? [...new Set(value)] : value;
const outputKey = keyMappings[key] || key;
const outputValue = functionMappings[outputKey]
? functionMappings[outputKey](metadataValue) : metadataValue;

return Object.assign(
output,
{
[keyMappings[key] || key]: Array.isArray(outputValue) && outputValue.length === 1
? outputValue.pop() : outputValue,
},
);
}, {}));

return JSON.stringify(normalisedSources);
};

const getRepositoryMetadata = async (rawNormalisedData) => {
log('getRepositoryMetadata');

const normalisedData = JSON.parse(rawNormalisedData);
const enrichedData = await Promise.all(normalisedData
.map(async (source) => {
if (source.repository && source.repository.match(/github\.com/)) {
return {
...source,
...(await getGithubRepositoryMetadata(
source.repository,
process.env.GITHUB_USER,
process.env.GITHUB_TOKEN,
)),
};
}

return source;
}));

return YAML.dump(enrichedData);
};

- const buildFromSource = () => src('gulpfile.js/sourceMetadata.json')
+ const full = () => src('gulpfile.js/metadata.json')
.pipe(transform('utf8', readSourceData))
.pipe(rename('raw-sources.yaml')) // Write raw data for debug purposes
.pipe(dest('build/'))
.pipe(transform('utf8', mergeSources))
.pipe(transform('utf8', normaliseSources))
.pipe(transform('utf8', getRepositoryMetadata))
.pipe(rename('tools.yaml'))
.pipe(dest('docs/'));

- exports.default = buildFromSource;
const metadata = () => src('docs/tools.yaml')
.pipe(transform('utf8', getRepositoryMetadata))
.pipe(rename('tools.yaml'))
.pipe(dest('docs/'));

exports.full = full;
exports.metadata = metadata;
144 changes: 0 additions & 144 deletions gulpfile.js/lib/index.js

This file was deleted.

1 change: 0 additions & 1 deletion gulpfile.js/lib/processors/default-processor.js

This file was deleted.

8 changes: 0 additions & 8 deletions gulpfile.js/lib/processors/openapi-tools-processor.js

This file was deleted.
