Motivations

One of our internal projects at Lumber is GRID, a discovery tool and search engine for Shopify products.

Business aspects aside, my motivation for GRID is to build a fast, responsive and overall excellent product in Clojure without cutting corners in user or developer experience. It also serves as a real-world battleground for root, my data-driven UI resolution library.

I have written search implementations for programming environment projects that filter fuzzily or exactly by part of a file's name or content. That's sufficient if you know what you're looking for and want to be forgiven for fat-fingering your query. Product search is different. While some users search for a particular brand or product, others want to browse a category. For the latter case, a good search experience hints at wider sets of results without losing track of context. For example, if I type "dress" I should see suggestions for particular products with "dress" in their name, but also categories like "Fashion".

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/6d6e192d-398c-4f95-a08d-48a5bdc69f8e/CleanShot_2020-07-06_at_17.26.42.gif

After evaluating solutions like Lucene and Elasticsearch, I decided that the overhead of networking, data sharing, managing indexes and running additional instances is not worth the effort at this stage of the project. So I took a stab at it myself and built a hybrid search implementation in fewer than 100 lines of code in one day, plus another day of performance improvements, UI implementation and QA. Besides being concise, it's also fast enough for a moderately large product catalogue (5000+ items).

Read on for feature goals and implementation details.

Search Feature Goals

After some research and brainstorming with my business partner Jan, we came up with the following deliverables:

and these user stories:

Implementation

Once an hour, GRID pulls products from numerous stores. These products land in a DataScript DB with the following schema:

(def schema
  {:brand/feed-url {:db/unique      :db.unique/identity}
   :brand/name     {:db/unique      :db.unique/identity}
   :brand/products {:db/cardinality :db.cardinality/many
                    :db/valueType   :db.type/ref}
   :brand/tags     {:db/valueType   :db.type/ref
                    :db/cardinality :db.cardinality/many}
   :product/tags   {:db/valueType   :db.type/ref
                    :db/cardinality :db.cardinality/many}
   :tag            {:db/unique      :db.unique/identity}
   :tag/kebab      {:db/unique      :db.unique/identity}
   :token          {:db/unique      :db.unique/identity}
   :token/sources  {:db/valueType   :db.type/ref
                    :db/cardinality :db.cardinality/many}})

Side note: I marked :brand/name as a unique identity only for the sake of this blog post. In practice, two or more brands can share the same name, and those brands would then clash.
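To make the effect of a unique identity on :brand/name concrete, here is a minimal sketch. It assumes DataScript is available as a dependency and uses a trimmed-down copy of the schema. The :product/name attribute, the brand "Acme" and the product names are made up for illustration and are not part of GRID's actual data.

```clojure
(require '[datascript.core :as d])

;; Trimmed-down copy of the schema above.
;; :product/name is a hypothetical attribute used only for this example.
(def example-schema
  {:brand/name     {:db/unique      :db.unique/identity}
   :brand/products {:db/cardinality :db.cardinality/many
                    :db/valueType   :db.type/ref}})

(def conn (d/create-conn example-schema))

;; First import: creates the brand and one product (nested map on a ref attribute).
(d/transact! conn [{:brand/name     "Acme"
                    :brand/products [{:product/name "Linen Dress"}]}])

;; Second import with the same :brand/name: because the attribute is a
;; unique identity, this upserts into the existing brand entity
;; instead of creating a second one.
(d/transact! conn [{:brand/name     "Acme"
                    :brand/products [{:product/name "Wool Scarf"}]}])

;; Number of distinct brand entities in the DB.
(def brand-count
  (count (d/q '[:find ?b
                :where [?b :brand/name]]
              @conn)))

;; Names of all products attached to the given brand.
(def product-names
  (d/q '[:find [?n ...]
         :in $ ?brand
         :where
         [?b :brand/name ?brand]
         [?b :brand/products ?p]
         [?p :product/name ?n]]
       @conn "Acme"))
```

Both imports end up on a single brand entity that holds both products. That is convenient for an hourly re-import, but it is also exactly the clash you get when two different stores happen to share a brand name.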