#1205 update archetype in README (#1206)
* #1205 update archetype in README

* update README for installed archetype

* Minor adjustments

---------

Co-authored-by: Richard Zowalla <rzo1@apache.org>
joshfischer1108 and rzo1 committed May 6, 2024
1 parent 96b8c32 commit 0315f95
Showing 1 changed file with 15 additions and 3 deletions.
18 changes: 15 additions & 3 deletions README.md
@@ -14,11 +14,18 @@ NOTE: These instructions assume that you have [Apache Maven](https://maven.apach

StormCrawler requires Java 11 or above.

DigitalPebble's [Ansible-Storm](https://github.com/DigitalPebble/ansible-storm) repository contains resources to install Apache Storm using Ansible. Alternatively, this [stormCrawler-docker](https://github.com/DigitalPebble/stormcrawler-docker) project should help you run Apache Storm on Docker.
DigitalPebble's [Ansible-Storm](https://github.com/DigitalPebble/ansible-storm) repository contains resources to install Apache Storm using Ansible. Alternatively, this [stormcrawler-docker](https://github.com/DigitalPebble/stormcrawler-docker) project should help you run Apache Storm on Docker.

Once Storm is installed, the easiest way to get started is to generate a brand new StormCrawler project using \:
Once Storm is installed, the easiest way to get started is to generate a new StormCrawler project following the instructions below:

`mvn archetype:generate -DarchetypeGroupId=org.apache.stormcrawler -DarchetypeArtifactId=stormcrawler-archetype -DarchetypeVersion=3.0`
### First, build the StormCrawler codebase
```shell
mvn install
```
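
Once the build has finished, the archetype should be present in the local Maven repository. The sketch below is one way to check that before generating a project. It relies on Maven's standard repository layout (dots in the groupId become directory separators) and assumes the default repository location `~/.m2/repository`. The helper name `archetype_dir` and the `M2_REPO` override variable are hypothetical and not part of StormCrawler or Maven.

```shell
# Hypothetical helper: print the directory where Maven's standard local
# repository layout places an artifact.
# $1 = groupId, $2 = artifactId, $3 = version
archetype_dir() {
  # Assumption: default local repository; M2_REPO is an illustrative override.
  repo="${M2_REPO:-$HOME/.m2/repository}"
  # Dots in the groupId map to directory separators.
  group_path="$(printf '%s' "$1" | tr '.' '/')"
  printf '%s/%s/%s/%s\n' "$repo" "$group_path" "$2" "$3"
}

dir="$(archetype_dir org.apache.stormcrawler stormcrawler-archetype 3.0-SNAPSHOT)"
if [ -d "$dir" ]; then
  echo "archetype found in $dir"
else
  echo "archetype not found in $dir; run 'mvn install' first"
fi
```

If your `settings.xml` points Maven at a different local repository, the check needs that path instead.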
### Then, generate a project using the locally installed archetype
```shell
mvn archetype:generate -DarchetypeGroupId=org.apache.stormcrawler -DarchetypeArtifactId=stormcrawler-archetype -DarchetypeVersion=3.0-SNAPSHOT
```

You'll be asked to enter a groupId (e.g. com.mycompany.crawler), an artifactId (e.g. stormcrawler), a version, a package name and details about the user agent to use.
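
Some of these values can also be supplied up front with the standard `-DgroupId` and `-DartifactId` properties of `archetype:generate`, in which case Maven should only prompt for what is still missing. The sketch below uses the example coordinates from the paragraph above and prints the command for review rather than running it. The `RUN_MVN` switch is a hypothetical convenience. The names of the archetype's user agent properties are not shown here, so they are left to the interactive prompts.

```shell
# Assemble the arguments; the groupId and artifactId values are examples only.
set -- archetype:generate \
  -DarchetypeGroupId=org.apache.stormcrawler \
  -DarchetypeArtifactId=stormcrawler-archetype \
  -DarchetypeVersion=3.0-SNAPSHOT \
  -DgroupId=com.mycompany.crawler \
  -DartifactId=stormcrawler

# Print the command for review.
echo "mvn $*"

# RUN_MVN is a hypothetical switch: set RUN_MVN=1 to actually invoke Maven.
if [ "${RUN_MVN:-0}" = "1" ]; then
  mvn "$@"
fi
```

Printing first makes it easy to confirm the archetype version matches the one installed locally before a project directory is created.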

@@ -28,6 +35,11 @@ Alternatively if you can't or don't want to use the Maven archetype above, you c

Have a look at the code of the [CrawlTopology class](https://github.com/apache/incubator-stormcrawler/blob/master/archetype/src/main/resources/archetype-resources/src/main/java/CrawlTopology.java), the [crawler-conf.yaml](https://github.com/apache/incubator-stormcrawler/blob/master/archetype/src/main/resources/archetype-resources/crawler-conf.yaml) file as well as the files in [src/main/resources/](https://github.com/apache/incubator-stormcrawler/tree/master/archetype/src/main/resources/archetype-resources/src/main/resources). They are all that is needed to run a crawl topology: all the other components come from the core module.

#### Archetype Notes

While you will always be able to build StormCrawler from source, we are working towards getting our first release out under the Apache Software Foundation.
Once this happens, generating StormCrawler projects will not require you to install the Maven archetype from source.

## Getting help

The [WIKI](https://github.com/apache/incubator-stormcrawler/wiki) is a good place to start your investigations, but if you are stuck, please use the tag [stormcrawler](http://stackoverflow.com/questions/tagged/stormcrawler) on StackOverflow or ask a question in the [discussions](https://github.com/apache/incubator-stormcrawler/discussions) section.