MANOJPATRA1991/LOG-ANALYSIS

LOGS ANALYSIS

This project is part of Part 3, "The Backend: Databases & Applications", of the Udacity Full Stack Nanodegree course.

Table of Contents

  1. Installation
  2. Description
  3. Reference
  4. License

Installation

NOTE:

This project was built on Windows 10 OS. All the interaction with the Virtual Machine was done through Command Prompt on Windows.

(Do not use Git Bash for this project. It simply won't work.)

  1. Python

The source code for this project is written in Python v3.6.1. To download version 3.6.1 directly, click here.

  2. Code Validators

The source code was checked for bugs and code quality using Pylint, the pep8 tool and the PEP8 online checker.

To install Pylint:

pip3 install pylint

To check Python file using Pylint:

pylint fileName.py

To install pep8:

pip3 install pep8

To check Python file using pep8:

pep8 fileName.py

  3. VirtualBox

To run the virtual machine, first download and install VirtualBox. VirtualBox can be downloaded from here.

  4. Vagrant

Vagrant is the software that configures the VM and lets you share files between your host computer and the VM's filesystem. Vagrant can be downloaded from here.

  5. psycopg2

The psycopg2 Python module is required. To install:

pip3 install psycopg2

Description

This project is a reporting tool that provides information about the most popular articles, the most popular authors and the most logged errors in a day from a news database.

The following views were created in the news database:

  1. MostViewedPaths
create view MostViewedPaths as (
select path, count(*) as views
from log
where path like '/_%'
group by path
order by views desc);
  2. articleShortInfo
create view articleShortInfo as (
select a.title, c.name, b.views
from articles as a join MostViewedPaths as b
on concat('/article/', a.slug) = b.path
join authors as c
on a.author = c.id
order by views desc);
  3. LogRequests
create view LogRequests as (
select time::timestamp::date, count(*) as total
from log
group by time::timestamp::date
order by total desc);
  4. ErrorRequests
create view ErrorRequests as (
select time::timestamp::date, count(*) as errors
from log
where status like '4%' or status like '5%'
group by time::timestamp::date
order by errors desc);
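
The LogRequests and ErrorRequests views each yield one row per day, so their rows can be combined to find the share of requests that failed on a given day. The following is a minimal sketch, assuming the rows arrive as (date, count) tuples in the shape cursor.fetchall() returns them. The function name error_rate_by_day and the sample figures are hypothetical and are not taken from the project's code or data.

```python
from datetime import date


def error_rate_by_day(log_rows, error_rows):
    """Return (day, percent_errors) pairs, highest error rate first.

    log_rows:   (day, total) rows, shaped like the LogRequests view.
    error_rows: (day, errors) rows, shaped like the ErrorRequests view.
    """
    # Index the error counts by day; days without errors default to 0.
    errors = dict(error_rows)
    rates = [
        (day, 100.0 * errors.get(day, 0) / total)
        for day, total in log_rows
        if total > 0
    ]
    return sorted(rates, key=lambda pair: pair[1], reverse=True)


# Made-up sample rows, for illustration only.
sample_totals = [(date(2016, 7, 1), 200), (date(2016, 7, 2), 400)]
sample_errors = [(date(2016, 7, 1), 2), (date(2016, 7, 2), 10)]

for day, percent in error_rate_by_day(sample_totals, sample_errors):
    print("{} - {:.2f}% errors".format(day, percent))
```

Doing the division in Python keeps the two views simple. The same ratio could also be computed in SQL by joining the views on the date column.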

The newsdb.py file contains the implementations for the three functions:

  1. get_most_popular_articles()
  2. get_most_popular_authors()
  3. get_most_logged_errors()

In each of these functions, a new Connection object and a new Cursor object are created as follows:

DBNAME = "news"
conn = psycopg2.connect(database=DBNAME)
cursor = conn.cursor()

Then, the query is run using the cursor.execute() function. The data is fetched using cursor.fetchall() and stored in a local variable which is returned by the respective functions.

Finally, the connection is closed using conn.close() within each function.
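
The connect, execute, fetch and close pattern described above can be sketched as a single helper. This is a minimal sketch and not the code in newsdb.py: the helper name fetch_all, the connect parameter and the query text are assumptions made for illustration. The try/finally block is a design choice that closes the connection even if the query raises an error.

```python
DBNAME = "news"


def fetch_all(query, connect=None):
    """Run one query against the news database and return every row.

    connect is a hypothetical hook: a zero-argument callable returning a
    DB-API connection. When omitted, psycopg2 is used as described above.
    """
    if connect is None:
        import psycopg2  # imported here so the helper can be tried without a database

        def connect():
            return psycopg2.connect(database=DBNAME)

    conn = connect()
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        # Runs on success and on error, so the connection is always closed.
        conn.close()


def get_most_popular_articles(connect=None):
    """Return (title, views) rows, most viewed first (assumed query text)."""
    query = "select title, views from articleShortInfo order by views desc;"
    return fetch_all(query, connect)
```

Calling get_most_popular_articles() with no arguments connects to the news database through psycopg2. Passing a connect callable makes it possible to exercise the function against a stand-in connection.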

Executing the program

  1. Copy the Python files to the folder that contains the newsdata.sql file.
  2. If the folder doesn't already contain a Vagrantfile, run the following command to create one:
vagrant init
  3. To start the virtual machine, run the following command from your local directory:
vagrant up
  4. Then, to open an SSH session into the virtual machine, run the following command:
vagrant ssh
  5. Type psql to open the interactive terminal for working with PostgreSQL.
  6. Create a new database (if it doesn't already exist):
create database news;
  7. Exit psql with Ctrl + D.
  8. Run psql -d news -f newsdata.sql to create the tables authors, articles and log. psql exits once the file has been processed.
  9. Start the psql terminal with psql and connect to the news database with \c news.
  10. Open another Command Prompt, move to the project directory, and run vagrant ssh to enter the VM.
  11. Move to the file's location and run the newsdata.py file with the following command to get the output:

python newsdata.py

To see what the output looks like, check here.

Reference

  1. Python Documentation
  2. Google Python Style Guide
  3. PEP8
  4. Vagrant
  5. Oracle VM VirtualBox
  6. PostgreSQL

License

The content of this repository is licensed under the MIT License.