Print table schema in DB initialization test #5248

Merged
merged 14 commits into from Jan 11, 2022
Changes from 10 commits
7 changes: 4 additions & 3 deletions .github/workflows/master.yml
@@ -168,10 +168,11 @@ jobs:
# Run tests
cd tests/db
docker-compose build
docker-compose run mlflow-postgres python log.py
docker-compose run mlflow-mysql python log.py
docker-compose run mlflow-sqlite python run_checks.py --schema-output schemas/sqlite.sql
docker-compose run mlflow-postgres python run_checks.py --schema-output schemas/postgres.sql
Member Author

Dump the table schema to make it easier to debug locally.

docker-compose run mlflow-mysql python run_checks.py --schema-output schemas/mysql.sql
docker-compose run mlflow-mssql ./init-mssql-db.sh
docker-compose run mlflow-mssql python log.py
docker-compose run mlflow-mssql python run_checks.py --schema-output schemas/mssql.sql

# Clean up
docker-compose down --rmi all --volumes
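
With each backend writing its schema to `schemas/<backend>.sql`, one way to use the dumps when debugging locally is to diff a fresh dump against a copy saved from an earlier run. The sketch below is only an illustration using the standard library's `difflib`; `diff_schemas` and the sample schema strings are hypothetical and not part of this PR.

```python
import difflib


def diff_schemas(old_schema: str, new_schema: str,
                 old_name: str = "old.sql", new_name: str = "new.sql") -> str:
    """Return a unified diff between two schema dumps (hypothetical helper)."""
    diff = difflib.unified_diff(
        old_schema.splitlines(),
        new_schema.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    )
    return "\n".join(diff)


# Made-up schema text standing in for two dumps of the same backend.
old = "CREATE TABLE example (\n    key VARCHAR(250) NOT NULL\n)"
new = "CREATE TABLE example (\n    key VARCHAR(250) NOT NULL,\n    value VARCHAR(500)\n)"
print(diff_schemas(old, new))
```

An empty result means the two dumps are identical, which makes the helper easy to use as a quick local check.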
1 change: 0 additions & 1 deletion tests/db/.dockerignore
@@ -1,6 +1,5 @@
**

!dist/*.whl
!log.py
!init-mssql-db.sh
!init-mssql-db.sql
2 changes: 2 additions & 0 deletions tests/db/.gitignore
@@ -0,0 +1,2 @@
schemas
mlflowdb
2 changes: 0 additions & 2 deletions tests/db/Dockerfile
@@ -7,5 +7,3 @@ COPY dist ./dist
RUN pip install dist/*.whl
RUN pip install psycopg2 pymysql mysqlclient
RUN pip list

COPY log.py .
6 changes: 0 additions & 6 deletions tests/db/Dockerfile.mssql
@@ -19,9 +19,3 @@ RUN apt-get update && ACCEPT_EULA=Y apt-get install -y mssql-tools unixodbc-dev
RUN pip install dist/*.whl
RUN pip install pyodbc
RUN pip list

COPY log.py .
COPY init-mssql-db.sh .
COPY init-mssql-db.sql .

RUN chmod +x init-mssql-db.sh
Comment on lines -22 to -27
Member Author

We no longer need these lines since we mount tests/db when running the container.

16 changes: 16 additions & 0 deletions tests/db/docker-compose.yml
@@ -13,6 +13,8 @@ services:
      - postgres
    build:
      context: .
    volumes:
      - .:/tmp/mlflow
    environment:
      MLFLOW_TRACKING_URI: postgresql://mlflowuser:mlflowpassword@postgres:5432/mlflowdb

@@ -30,6 +32,8 @@ services:
      - mysql
    build:
      context: .
    volumes:
      - .:/tmp/mlflow
Comment on lines +35 to +36
Member Author

Use volumes for a faster development cycle instead of copying scripts into the Docker image.

    environment:
      MLFLOW_TRACKING_URI: mysql://mlflowuser:mlflowpassword@mysql:3306/mlflowdb

@@ -46,5 +50,17 @@ services:
    build:
      context: .
      dockerfile: Dockerfile.mssql
    volumes:
      - .:/tmp/mlflow
    environment:
      MLFLOW_TRACKING_URI: mssql+pyodbc://mlflowuser:Mlfl*wpassword1@mssql/mlflowdb?driver=ODBC+Driver+17+for+SQL+Server

  mlflow-sqlite:
    depends_on:
      - postgres
    build:
      context: .
    volumes:
      - .:/tmp/mlflow
    environment:
      MLFLOW_TRACKING_URI: sqlite:////tmp/mlflow/mlflowdb
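
The `mlflow-sqlite` service combines the bind mount with a four-slash SQLite URI, which SQLAlchemy-style URLs use for an absolute path. Assuming that convention, the database file lives at `/tmp/mlflow/mlflowdb` inside the container, that is, inside the mounted `tests/db` directory, which would explain the new `mlflowdb` entry in `tests/db/.gitignore`. A small standard-library sketch of that reading (`sqlite_db_path` is a hypothetical helper, not MLflow or SQLAlchemy API):

```python
import posixpath
from urllib.parse import urlsplit


def sqlite_db_path(uri: str) -> str:
    """Extract the database path from a sqlite:// URI (hypothetical helper).

    Assumes the SQLAlchemy-style layout sqlite://<host>/<database>, where the
    host is empty and everything after the third slash is the database path.
    """
    parts = urlsplit(uri)
    if parts.scheme != "sqlite":
        raise ValueError(f"not a sqlite URI: {uri}")
    # parts.path starts with the slash that separates host from database.
    return parts.path[1:]


db_path = sqlite_db_path("sqlite:////tmp/mlflow/mlflowdb")
mount_point = "/tmp/mlflow"  # container side of the `.:/tmp/mlflow` volume

print(db_path)
print(posixpath.isabs(db_path))
print(posixpath.relpath(db_path, mount_point))
```

With three slashes instead of four, the same helper returns a relative path, which would be resolved against the working directory of the process instead.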
Empty file modified tests/db/init-mssql-db.sh 100644 → 100755
Empty file.
Empty file modified tests/db/init-mssql-db.sql 100644 → 100755
Empty file.
27 changes: 0 additions & 27 deletions tests/db/log.py

This file was deleted.

67 changes: 67 additions & 0 deletions tests/db/run_checks.py
@@ -0,0 +1,67 @@
import os
import argparse

import sqlalchemy
from sqlalchemy.schema import MetaData, CreateTable

import mlflow
from mlflow.tracking._tracking_service.utils import _TRACKING_URI_ENV_VAR


class MockModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        pass

    def predict(self, context, model_input):
        pass


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--schema-output", required=True, help="Output path of DB schema")
    return parser.parse_args()


def run_logging_operations():
    with mlflow.start_run() as run:
        print("Tracking URI:", mlflow.get_tracking_uri())
        mlflow.log_param("p", "param")
        mlflow.log_metric("m", 1.0)
        mlflow.set_tag("t", "tag")
        mlflow.pyfunc.log_model(
            artifact_path="model",
            python_model=MockModel(),
            registered_model_name="mock",
        )
        print(mlflow.get_run(run.info.run_id))


def get_db_schema():
    engine = sqlalchemy.create_engine(mlflow.get_tracking_uri())
    created_tables_metadata = MetaData(bind=engine)
    created_tables_metadata.reflect()
    # Write out table schema as described in
    # https://docs.sqlalchemy.org/en/13/faq/metadata_schema.html#how-can-i-get-the-create-table-drop-table-output-as-a-string
    lines = []
    for ti in created_tables_metadata.sorted_tables:
        lines += list(map(str.rstrip, str(CreateTable(ti)).splitlines()))
    return "\n".join(lines)


def main():
    assert _TRACKING_URI_ENV_VAR in os.environ

    run_logging_operations()
Collaborator

Is this line for triggering table creation?

Member Author

Yep!

Member Author

@harupy Jan 11, 2022

It also makes sure we can run logging operations successfully.

    schema = get_db_schema()
    title = "Schema"
    print("=" * 10, title, "=" * 10)
    print(schema)
    print("=" * (20 + 2 + len(title)))
    args = parse_args()
    os.makedirs(os.path.dirname(args.schema_output), exist_ok=True)
    with open(args.schema_output, "w") as f:
        f.write(schema)


if __name__ == "__main__":
    main()
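
One edge case in `main()`: `os.path.dirname` returns an empty string for a bare filename such as `schema.sql`, and `os.makedirs("")` raises `FileNotFoundError`. The workflow always passes paths under `schemas/`, so CI is unaffected. If the script should also accept a bare filename, a guard along these lines would work; `write_schema` is a hypothetical helper, not part of this PR.

```python
import os
import tempfile


def write_schema(path: str, schema: str) -> None:
    """Write the schema text to `path`, creating parent directories if any."""
    parent = os.path.dirname(path)
    if parent:  # empty for a bare filename; os.makedirs("") would raise
        os.makedirs(parent, exist_ok=True)
    with open(path, "w") as f:
        f.write(schema)


# Usage: write into a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "schemas", "sqlite.sql")
    write_schema(nested, "CREATE TABLE example (id INTEGER)")
    with open(nested) as f:
        print(f.read())
```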