Backend: Metabase Malli Cheatsheet

bryan edited this page May 6, 2024 · 7 revisions

Links

Link | Summary | Level
https://malli.io | Malli Playground | basic
https://github.com/metosin/malli | Malli README | basic
https://github.com/metosin/malli/blob/master/docs/function-schemas.md | Function Schemas Readme | intermediate
https://github.com/metosin/malli/blob/master/docs/tips.md | Tips Readme (good stuff in here) | advanced
https://github.com/metosin/malli/tree/master/docs | Other Docs |

Malli Docs v2024-05

Malli is a data validation and specification library for Clojure(Script).

You can do a lot with Malli, but here are some of the main things:

  • Define schemas for your data with hiccupy Clojure data-structures
  • Validate data against those schemas with mc/validate
  • Explain why data is invalid with mc/explain
  • Generate random data that conforms to those schemas with mg/generate
    • Create a test.check generator with mg/generator
  • Transform data into a canonical form with malli.transform
  • Extend Malli with custom schema types in metabase.util.malli.schema and elsewhere
  • Describe schemas in a human-readable way with umd/describe

Handy Requires

I keep this snippet around; it saves a lot of time whenever I work with Malli hands-on:

#_:clj-kondo/ignore ;;nocommit
(require '[malli.core :as mc] '[malli.error :as me] '[malli.util :as mut] '[metabase.util.malli :as mu]
         '[metabase.util.malli.schema :as ms] ;; ms/BooleanValue, ms/PositiveInt, ...
         '[metabase.util.malli.describe :as umd] '[malli.provider :as mp] '[malli.generator :as mg]
         '[malli.transform :as mtx])

Schemas

Schemas are the core of Malli. They define the structure of your data. Malli provides a number of built-in schema types, and you can also define your own custom schema types. Here are some examples of built-in schema types:

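  • :int - an integer
  • :string - a string
  • :map - a map
  • :tuple - a fixed-length, heterogeneous collection
  • :enum - an enumeration
  • :and - a combination of multiple schemas
  • :or - a choice between multiple schemas
  • :multi - a schema that dispatches on the value to pick one of several schemas
  • :keyword - a keyword
  • :boolean - a boolean
  • :nil - nil
  • :vector - a vector
  • :set - a set
  • :sequential - a sequential collection (list, vector, seq) of something
  • inst? - a timestamp (a predicate schema; as far as I know Malli's default registry has no :inst keyword)
  • coll? - any collection (also a predicate schema, not a :coll keyword)
  • :re - a string matching a regex, which is one way to express emails or URLs, since Malli itself ships no :email or :url types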
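To make the list above concrete, here is a small sketch that validates values against a few of the built-in types. The Shape schema and its keys are invented for this example; everything else is plain Malli.

```clojure
(require '[malli.core :as mc])

(mc/validate [:enum :left :right] :left)       ;; => true
(mc/validate [:tuple :int :string] [1 "a"])    ;; => true
(mc/validate [:tuple :int :string] [1 2])      ;; => false
(mc/validate [:or :int :string] :neither)      ;; => false
(mc/validate [:sequential :keyword] [:a :b])   ;; => true

;; :multi picks a branch by dispatching on the value (here, on the :type key).
;; Shape is a hypothetical schema, not something defined in Metabase.
(def Shape
  [:multi {:dispatch :type}
   [:circle [:map [:type [:= :circle]] [:radius :int]]]
   [:square [:map [:type [:= :square]] [:side :int]]]])

(mc/validate Shape {:type :circle :radius 2})  ;; => true
(mc/validate Shape {:type :square :radius 2})  ;; => false, :side is missing
```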

We have a few custom schema types in Metabase. They're mostly vanilla schemas annotated with error messages to be used by our API layer, but we can put any data we want in them. Here are some examples of custom schema types:

(mc/validate ms/BooleanValue false)
;; => true

(mc/validate ms/BooleanValue 2)
;; => false

(mc/validate ms/PositiveInt -1)
;; => false
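As a rough idea of what "a vanilla schema annotated with an error message" can look like, here is a sketch in plain Malli. PositiveIntSketch and its message are hypothetical and are not the actual definition of ms/PositiveInt.

```clojure
(require '[malli.core :as mc]
         '[malli.error :as me])

;; Hypothetical schema, named only for illustration.
(def PositiveIntSketch
  [:int {:min 1
         :error/message "value must be an integer greater than zero."}])

(mc/validate PositiveIntSketch 3)
;; => true

;; The :error/message property replaces Malli's default message.
(me/humanize (mc/explain PositiveIntSketch -1))
;; => ["value must be an integer greater than zero."]
```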

Validation

Malli provides a number of functions for validating data against schemas. Here are some examples:

  • malli.core/validate - validate data against a schema

(mc/validate :int 1)
;; => true

  • malli.core/assert - return the value when it is valid, throw otherwise

(mc/assert :int 1)
;; => 1

(try (mc/assert :int "not an int")
     (catch Exception e (ex-data e)))

Also prints:

-- Schema Error ------------------------------------------- NO_SOURCE_FILE:87 --

Value:

  "not an int"

Errors:

  ["should be an integer"]

Schema:

  :int

More information:

  https://cljdoc.org/d/metosin/malli/CURRENT

--------------------------------------------------------------------------------

(mc/assert ms/PositiveInt 1)
;; => 1
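When the same schema is checked many times, it can help to build the validator once and reuse it. The sketch below uses mc/validator and mc/explainer from plain Malli; the Point schema is made up for the example.

```clojure
(require '[malli.core :as mc]
         '[malli.error :as me])

;; Point is a hypothetical schema used only in this example.
(def Point [:map [:x :int] [:y :int]])

;; Build the functions once, call them as often as needed.
(def valid-point? (mc/validator Point))
(def explain-point (mc/explainer Point))

(valid-point? {:x 1 :y 2})
;; => true

(valid-point? {:x 1})
;; => false

(me/humanize (explain-point {:x 1 :y "2"}))
;; => {:y ["should be an integer"]}
```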

Generation

Malli provides a number of functions for generating random data that conforms to schemas. This is great when building a schema, because you can generate random data and check that it matches what you had in mind. Here are some examples:

Generating scalars:

(mg/generate :int)
;; => 34

(mg/generate :boolean)
;; => false

(mg/generate [:or :int :string])
;; => -1713124

(mg/generate [:enum :left :right])
;; => :left

Generating maps:

(mg/generate [:map [:a :int] [:b :string]])
;; => {:a -1, :b "aEKUgBqXop"}

Generating sequences:

(mg/generate [:sequential :int])
;; => [-44971 -49451 -50 -444185161 -1 -298 -2 133027287 -319 -1 340575216 58 -33 -12 -267328666 130404 -52261 -330386
;;     -29770 -241298 -3903979 12498718 213279 -9636714 -1 216 -1]
;; => [-200996630]
;; => [15509387 -19611096 -12164656 42892 476216 2536 3514 194075784 -119 395 5460693 -15 2983704 1410 -2617 -39274550]

(mg/generate [:sequential [:enum :left :right]])
;; => [:left :left :left :right :left]

Generating tuples:

(mg/generate [:tuple :int :string :boolean])
;; => [-1 "4K3fnHAFn5xQ4YV" true]
;; => [-81255 "3k5W65yXc82vCz6j62xp7l" false]

Using a seed

Up until now, your REPL output wouldn't match mine. We can change that by using :seed.

(mg/generate :int {:seed 1})
;; => 909

Using a size

Size can be used to control the "complexity" of the generated data. It's useful for generating simple or more complicated examples.

(count (mg/generate [:sequential :int] {:seed 1 :size 2}))
;; => 2

(count (mg/generate [:sequential :int] {:seed 10 :size 20000}))
;; => 2619
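Two properties are worth checking at the REPL: the same :seed and :size give the same value, and generated values validate against the schema they came from. This is a small sketch with a made-up Point schema.

```clojure
(require '[malli.core :as mc]
         '[malli.generator :as mg])

;; Point is a hypothetical schema used only in this example.
(def Point [:map [:x :int] [:label :string]])

;; Same seed and size: same generated value.
(= (mg/generate Point {:seed 42 :size 10})
   (mg/generate Point {:seed 42 :size 10}))
;; => true

;; Everything the generator produces should validate against the schema.
(every? #(mc/validate Point %) (mg/sample Point {:seed 42}))
;; => true
```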

Human friendly descriptions

(umd/describe [:sequential :int])
;; => "sequence of integer"

(umd/describe [:sequential [:map [:x [:sequential :int]]]])
;; => "sequence of map where {:x -> <sequence of integer>}"

Intermediate Generators: generating a permissions graph

The following code generates a model ID, or a set of IDs, that match rows in the database. These generators have not been used yet, but they are here for when we need them.

(require '[toucan2.core :as t2])

(set! *warn-on-reflection* true)

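(defn- rand-pk-for-model
  "Returns a primary key of `model` that exists in the database: a random one,
  or, given `:seed`, the one at index |seed| rem the number of keys."
  ([model]
   (rand-nth (t2/select-pks-vec model)))
  ([model & {:keys [seed]}]
   (let [pks (t2/select-pks-vec model)]
     (nth pks (rem (Math/abs ^long seed) (count pks))))))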

(defn default-id-for-model [model]
  ;; output mapped from input:
  [:int {:gen/fmap (fn [n] (#'rand-pk-for-model model :seed n))}])

(def ^:private db-id (default-id-for-model :model/Database))
(def ^:private user-id (default-id-for-model :model/User))
(def ^:private group-id (default-id-for-model :model/PermissionsGroup))

(defn- default-ids-for-model [model]
  [:set
   {:gen/fmap
    ;; Notice: we use the size of the input to determine the size of the output.
    ;; This makes shrinking with test.check work way better.
    (fn [in]
      (loop [n (count in) acc #{}]
        (if (zero? n)
          acc
          (recur
           (dec n)
           (conj acc (#'rand-pk-for-model model))))))}
   :int])

(def ^:private db-ids (default-ids-for-model :model/Database))
(def ^:private user-ids (default-ids-for-model :model/User))
(def ^:private group-ids (default-ids-for-model :model/PermissionsGroup))

(require '[clojure.test.check.clojure-test :as ct :refer [defspec]]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop]
         '[clojure.test.check :as tc]
         '[clojure.test :refer :all]
         '[metabase.test :as mt]) ;; for mt/with-temp

;; This should fail, which indicates that the generator CAN find all values currently in the database.
(mt/with-temp [:model/User {the-user-id :id} {}]
  (tc/quick-check 1000
    (prop/for-all [n (mg/generator user-id)]
      (not= n the-user-id))))
;; => {:fail [464],
;;   :failed-after-ms 16,
;;   :failing-size 29,
;;   :num-tests 30,
;;   :pass? false,
;;   :result false,
;;   :result-data nil,
;;   :seed 1711141620793,
;;   :shrunk {...}}
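The same :gen/fmap idea can be tried without a database. In this sketch a fixed vector stands in for (t2/select-pks-vec model); known-ids and known-id are hypothetical names, and mod is used so that negative inputs still map to a valid index.

```clojure
(require '[malli.core :as mc]
         '[malli.generator :as mg])

;; Stand-in for the primary keys that would come from the database.
(def known-ids [3 14 15 92])

;; Map whatever int the default :int generator produces onto one of the known ids.
(def known-id
  [:int {:gen/fmap (fn [n] (nth known-ids (mod n (count known-ids))))}])

;; Every generated value is one of the known ids.
(every? (set known-ids) (mg/sample known-id {:seed 1 :size 50}))
;; => true
```

Because the output is a pure function of the generated int, a given :seed reproduces the same ids, which is not the case when the mapping calls rand-nth.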

Metabase Malli Cheatsheet

In the app

  • mc/validate
  • mc/explain
  • me/humanize
  • mu/defn

dev-time helpers (occasionally used in the app itself)

  • umd/describe
  • mp/provide
  • mg/generate
  • mg/sample

(ns malli.cheatsheet
  (:require [malli.core :as mc] ;; nocommit
            [malli.error :as me]
            [malli.util :as mut]
            [metabase.util.malli :as mu]
            [metabase.util.malli.describe :as umd] ;; umd/describe
            [malli.provider :as mp]
            [malli.generator :as mg]
            [malli.transform :as mtx]
            [clojure.test.check.generators :as gen]))

mc/validate - validates a value against a given schema.

(mc/validate int? 3)
;; => true

(mc/validate int? "3")
;; => false

mc/explain - like spec's explain-data. Returns nil when validation passes.

(mc/explain int? 3)
;; => nil

(mc/explain int? "3")
;; => {:schema int?, :value "3", :errors ({:path [], :in [], :schema int?, :value "3"})}

(mc/explain [:map [:x [:map [:y int?]]]] {:x {}})
;; => {:schema [:map [:x [:map [:y int?]]]], :value {:x {}}, :errors ({:path [:x :y], :in [:x :y], :schema [:map [:y int?]], :value nil, :type :malli.core/missing-key})}

me/humanize - rewrites the output of mc/explain into something that is usually easier to read

(me/humanize (mc/explain [:map [:x [:map [:y int?]]]] {:x {}}))
;; => {:x {:y ["missing required key"]}}

;; malli schemas accept properties:
(mc/validate :string "")
;; => true

(mc/validate [:string {:min 3}] "")
;; => false

(me/humanize (mc/explain [:string {:min 3}] ""))
;; => ["should be at least 3 characters"]

malli schemas are extensible:

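(def special-kw [:and
                 keyword?
                 [:fn {:error/message "kw name must be less than 3 long"}
                  (fn [kw]
                    (> 3 (count (name kw))))]
                 ;; needed to produce the namespace message in the last example below
                 [:fn {:error/message "kw namespace must be less than 3 long"}
                  (fn [kw]
                    (> 3 (count (namespace kw))))]])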

(me/humanize (mc/explain special-kw :cd))
;; => nil

(me/humanize (mc/explain special-kw :bar))
;; => ["kw name must be less than 3 long"]

(me/humanize (mc/explain [:map [:record special-kw]] {:record "abc/def"}))
;; => {:record ["should be a keyword"
;;              "kw name must be less than 3 long"
;;              "kw namespace must be less than 3 long"]}

malli schemas can be as strict + detailed as we need:

;; (def Address map?)

-- vs --

(def Address
  [:map
   [:id string?]
   [:tags [:set keyword?]]
   [:address
    [:map
     [:street string?]
     [:city string?]
     [:zip int?]
     [:lonlat [:tuple double? double?]]]]])

-- vs --

(def Address
  [:map
   [:id string?]
   [:tags [:set {:min 1 :max 10} keyword?]]
   [:address
    [:map
     [:street [string? {:min 1}]]
     [:city [string? {:min 1}]]
     [:zip pos-int?]
     [:lonlat
      [:tuple
       [double? {:title "Longitude" :min -180 :max 180}]
       [double? {:title "Latitude" :min -90 :max 90}]]]]]])

(mc/validate Address {})
;; => false

(mc/validate Address {:id "EyMHW13oSVb3dXbA045xk37",
                      :tags #{:a :b},
                      :address {:street "bird rd.",
                                :city "melrose",
                                :zip 11510,
                                :lonlat [-238.79638671875 -0.01470947265625]}})
;; => true

(me/humanize
 (mc/explain Address
             {:id "EyMHW13oSVb3dXbA045xk37",
              :tags #{:a :b},
              :address {:street "bird rd.",
                        :city "melrose",
                        :zip -11510,
                        :lonlat [-238.79638671875 -0.01470947265625]}}))
;; => {:address {:zip ["should be a positive int"]}}

mu/defn - like s/defn for malli, but with better error reporting.

Checks input and output for invalid shapes, and returns high-signal error messages:

Invalid input

(mu/defn f :- int? [a :- int? b :- [:map [:x int?]]]
  (+ a (:x b)))

(try (f "1" 2)
     (catch Exception e [(ex-message e) (ex-data e)]))
;; => [":malli.core/invalid-input {:input [:cat int? [:map [:x int?]]], :args [\"1\" 2], :schema [:=> [:cat int? [:map [:x int?]]] int?]}"
;;     {:type :malli.core/invalid-input,
;;      :data {:input [:cat int? [:map [:x int?]]], :args ["1" 2], :schema [:=> [:cat int? [:map [:x int?]]] int?]}
;;      :link "https://malli.io?schema=%5B%3Acat%20int%3F%20%5B%3Amap%20%5B%3Ax%20int%3F%5D%5D%5D%0A&value=%5B%221%22%202%5D%0A"
;;      :humanized [["should be an int"]]}]

Invalid output

(mu/defn g :- int? [] "3")

(try (g)
     (catch Exception e [(ex-message e) (ex-data e)]))
;; => [":malli.core/invalid-output {:output int?, :value \"3\", :args [], :schema [:=> :cat int?]}"
;;     {:type :malli.core/invalid-output,
;;      :data {:output int?, :value "3", :args [], :schema [:=> :cat int?]},
;;      :link "https://malli.io?schema=int%3F%0A&value=%223%22%0A",
;;      :humanized ["should be an int"]}]
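mu/defn is Metabase-specific, but the same input and output checking can be tried with plain Malli. This sketch uses mc/-instrument with a function schema; it is only an analogue for experimenting at the REPL, not a description of how mu/defn is implemented, and add-x is a made-up name.

```clojure
(require '[malli.core :as mc])

;; Wrap a plain function so its arguments and return value are checked
;; against the [:=> input output] function schema.
(def add-x
  (mc/-instrument
   {:schema [:=> [:cat :int [:map [:x :int]]] :int]}
   (fn [a b] (+ a (:x b)))))

(add-x 1 {:x 2})
;; => 3

(try (add-x "1" {:x 2})
     (catch Exception e (:type (ex-data e))))
;; => :malli.core/invalid-input
```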

Dev-time helpers to get a handle on what a schema means:

Describe - given a schema, returns a description in English

(umd/describe [:maybe [:map {:title "user"} [:id int?]]])
;; => "nullable map(titled: 'user') where {:id -> <integer>}"

(umd/describe [:map
               [:name int?]
               [:docstring string?]
               [:args [:vector symbol?]]])
;; => "map where {:name -> <integer>, :docstring -> <string>, :args -> <vector of symbol>}"

Provide - Given a sequence of shapes, returns a schema that matches them.

;; > it's not always super accurate

(mp/provide [{:a 1} {:a 2} {:a 3 :b "tree"}])
;; => [:map [:a int?] [:b {:optional true} string?]]

Generate + Sample - show examples that match a schema.

  • generate returns one, and sample returns increasingly complicated examples.

(mg/generate [:vector int?])
;; => [-5023786 -218]

(mg/sample [:vector int?])
;; => ([] [] [-1 0] [0] [2] [-4 -1 -1] [2 1] [0 0] [-1] [205 -19 105 -23 -1 -35 -17 -1])

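Note: generation doesn't always work. Here [:and int? [:= 3]] generates arbitrary ints and then filters for 3, which almost never succeeds: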

(try (mg/generate [:vector {:min 100000} [:and int? [:= 3]]])
     (catch Throwable e [(ex-message e) (ex-data e)]))
;; => ["Couldn't satisfy such-that predicate after 100 tries."
;;     {:pred #function[malli.impl.util/f--42056--42057/fn--42063],
;;     :gen #clojure.test.check.generators.Generator{:gen #function[clojure.test.check.generators/such-that/fn--81348]},
;;     :max-tries 100}]

This can be alleviated by telling Malli which values to generate:

(mg/generate [:vector [:and {:gen/elements [3 4]}
                       int?
                       ;; note: this is better written as [:enum 3 4], which generates properly on its own
                       [:or [:= 3] [:= 4]]]])
;; => [3 4 3]

-- or --

(require '[clojure.test.check.generators :as gen])

(mg/generate
 [:vector {:title "Vector of only Prime Integers"}
  ;; :gen/gen replaces the generator for this schema; the input from gen/nat is ignored
  [:and {:gen/gen (gen/fmap
                   (fn gen-prime [_] (rand-nth [2 3 5 ,,,]))
                   gen/nat)}
   int?
   [:fn (fn prime? [x] (#{2 3 5 ,,,} x))]]])
