
Using Spline in Spark Scala project #785

Open
zacayd opened this issue Feb 5, 2024 · 1 comment
zacayd commented Feb 5, 2024

Hi @wajda

I have Spark code, written in Scala, that I got from someone in the organization.
They have a Configuration.conf file.
If I add the following to the config file:

  spline {
    lineageDispatcher = "http"
    lineageDispatcher.http.producer.url = http://localhost:9090/producer
  }
  
  // Spark configurations
  sql {
    queryExecutionListeners = "za.co.absa.spline.harvester.listener.SplineQueryExecutionListener"
  }

And in the initialization of Spark I put:

pyspark \
  --packages za.co.absa.spline.agent.spark:spark-2.4-spline-agent-bundle_2.12:<VERSION> \
  --conf "spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener" \
  --conf "spark.spline.lineageDispatcher.http.producer.url=http://localhost:9090/producer"
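Since the project in question is Scala rather than Python, the equivalent launch would go through spark-submit instead of pyspark. A hedged sketch, assuming the agent bundle is resolved from Maven via --packages (in that case no local jar copy is needed); the application class `com.example.MyApp`, the jar name `my-app.jar`, and `<VERSION>` are placeholders, not values from this thread:

```shell
# Sketch: submitting a Scala Spark job with the Spline agent fetched
# from Maven Central via --packages. Placeholders: com.example.MyApp,
# my-app.jar, <VERSION>.
spark-submit \
  --class com.example.MyApp \
  --packages za.co.absa.spline.agent.spark:spark-2.4-spline-agent-bundle_2.12:<VERSION> \
  --conf "spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener" \
  --conf "spark.spline.lineageDispatcher=http" \
  --conf "spark.spline.lineageDispatcher.http.producer.url=http://localhost:9090/producer" \
  my-app.jar
```

With --packages, Spark downloads the bundle and its dependencies into its ivy cache and puts them on both the driver and executor classpaths automatically.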

Where do I need to put the Jar?

Here is the full config:

dt="20240120"
output = "s3://aiser-tests/example_project_output"
baseAccumulatorsPath = "s3://aiser-tests/accumulators"
numPartitions = 5864

requestLogTable = "dl_fact.fact_request_p"
impressionLogTable = "dl_fact.fact_impression_p"
clicksLogTable = "dl_fact.fact_click_p"
winLogTable = "dl_fact.fact_win_p"
dpmTable = "dl_udms_work.dim_playback_methods_udms2"
dpsTable = "dl_ingested_data_dim.dim_player_sizes_udms2"
dimAdUnits = "dl_ingested_data_dim.dim_ad_units"
rcTable = "dl_ingested_data_dim.dim_rate_card_lines_extended"
rceTable = "dl_ingested_data_dim.dim_rate_card_lines_exceptions_extended"
eventsTable = "dl_fact.fact_event_p"
sfEventsTable = "dl_fact.fact_sf_events_p"
sfTimeTable = "dl_fact.fact_sf_times_p"

spark {
  master = "local[4]"
  appName = "data-core-agg-request-funnel"
  serializer = ""
  classesToRegister = ""
}
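On the jar question itself, a hedged alternative to --packages: if the agent bundle jar has already been downloaded, it can be passed with --jars, in which case it only needs to exist at the given path on the machine running spark-submit (Spark ships it to the driver and executors). The class, jar, and path names below are placeholders:

```shell
# Sketch: same submission, but with a locally downloaded agent bundle jar
# supplied via --jars instead of --packages. Placeholders: com.example.MyApp,
# /path/to/..., my-app.jar, <VERSION>.
spark-submit \
  --class com.example.MyApp \
  --jars /path/to/spark-2.4-spline-agent-bundle_2.12-<VERSION>.jar \
  --conf "spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener" \
  --conf "spark.spline.lineageDispatcher=http" \
  --conf "spark.spline.lineageDispatcher.http.producer.url=http://localhost:9090/producer" \
  my-app.jar
```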
wajda (Contributor) commented Feb 6, 2024

I don't understand the question. Can you clarify it please?
