
kasipavankumar/sqoop-docker


Apache Sqoop using Docker. 🐳

A Docker image to play around with Apache Sqoop, with Hadoop configured in pseudo-distributed mode (a single-node cluster).

Below are the steps to use this image on Play with Docker.

  1. First of all, create an account on Docker Hub.
  2. Log in to Play with Docker using the Docker Hub account you just created.
  3. Click the green "Start" button to start a session.
  4. Click "+ Add new instance" in the left pane to create a VM.
  5. A new terminal should show up in the right pane. Here, we need to pull the Docker image from GitHub Container Registry (GHCR). To do so, execute:
docker pull ghcr.io/kasipavankumar/sqoop-docker:latest
  6. After the image has been pulled into the VM, we need to start a new container & switch into its terminal (usually bash). To do so, execute:
docker run -it ghcr.io/kasipavankumar/sqoop-docker:latest

At this stage, the container boots up and executes all the steps required for running Sqoop.

From now on, you will be inside the container's bash terminal. 🚀
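
The pull and run steps can also be wrapped in a small shell helper. This is a minimal sketch, not part of this repository: the function name `start_sqoop_container` is hypothetical, and the `--rm` flag (which deletes the container once you exit it) is an addition that the steps above do not use.

```shell
# Image name taken from the steps above.
IMAGE="ghcr.io/kasipavankumar/sqoop-docker:latest"

# Hypothetical helper: pull the image, then open an interactive container.
start_sqoop_container() {
    # Fail early with a readable message if the Docker CLI is missing.
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker CLI not found" >&2
        return 1
    fi
    # Stop here if the pull fails (e.g. no network access).
    docker pull "$IMAGE" || return 1
    # --rm is an assumption: it removes the container when the shell exits.
    docker run -it --rm "$IMAGE"
}
```

After pasting this into the Play with Docker terminal, calling `start_sqoop_container` should drop you into the container's shell, just like the two separate commands.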

To verify that everything works, try the following command:

sqoop import \
    --connect jdbc:mysql://localhost/employees \
    --table employees \
    --username bda \
    --password 123456
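
Passing the password inline leaves it in the shell history. As a hedged alternative, Sqoop's `-P` option prompts for the password on the console instead. The sketch below wraps that variant in a function; the name `import_employees` is hypothetical, and the connection details are the ones from the command above.

```shell
# Hypothetical wrapper: same import as above, but -P makes Sqoop
# prompt for the password instead of reading it from the command line.
import_employees() {
    sqoop import \
        --connect jdbc:mysql://localhost/employees \
        --table employees \
        --username bda \
        -P
}
```

Calling `import_employees` inside the container should ask for the password and then run the same import.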

The import should copy all the employees data into the Hadoop file system, which can be verified by:

hadoop fs -ls /user/root/employees

which should list around 5 files; using cat on any one of them should show a few employee records. 🎉
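
The check can be scripted as well. The sketch below assumes the usual Hadoop output naming, where the first map task writes `part-m-00000`; with Sqoop's default of four map tasks, that would explain a listing of roughly five entries (four part files plus a `_SUCCESS` marker). The function name `preview_import` is hypothetical.

```shell
# Hypothetical helper: list the import directory and preview one part file.
preview_import() {
    # Default to the directory used in the verification step above.
    import_dir="${1:-/user/root/employees}"
    hadoop fs -ls "$import_dir" || return 1
    # part-m-00000 is an assumed file name; adjust it to match the listing.
    hadoop fs -cat "$import_dir/part-m-00000" | head -n 5
}
```

Run `preview_import` with no arguments to inspect the default directory, or pass another HDFS path as the first argument.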




D. Kasi Pavan Kumar (c) 2021