Dr. Shopify, Part 1 or: How I Learned to Stop Worrying and Love GraphQL

A deep dive into my journey learning GraphQL while building a Shopify store, exploring the challenges and triumphs along the way.

shopify, graphql, typescript, kotlin, storytime

The E-Commerce Platform of Your Good (and Bad) Dreams

Shopify is a tool first and foremost - it is not a panacea to launching your business into the next Golden Age. It's a great tool for what it does, but it comes at a solid cost. Much like any other sales and e-commerce platform, it requires a decent amount of skill acquisition to get familiar with the quirks of the platform. If you are building something with a large inventory that needs to be synchronized with an external data source, you are in for an uphill battle learning the APIs and tools you will need.

My history with Shopify goes back approximately a year, when I started investigating bringing a local store in my area up to date with their e-commerce setup. They previously ran Magento on a VPS (which required 16 GB of RAM, so the bill was already nearly $100/mo just for that server on DigitalOcean), and administering this box was a royal pain. I initially tried to work with Magento, before pivoting to full-time development of the Shopify store and inventory synchronization tools.

The store I work with sells one-of-a-kind hand-knotted oriental rugs. Their stock totals around 15000 rugs from a handful of different vendors/warehouse locations. Because these rugs are unique pieces, every single listing is one of a kind. This means that I needed a way to create, maintain, and synchronize 15000 individual “products” in Shopify. The inventory “feeds” from vendors are generated and maintained through systems that can only give me either a CSV file or an Excel .xlsx file to import from.

Needless to say, I was going to have my work cut out for me.

An Actually Clean API Surface

My first mission was getting API requests working. As always, start with reading ALL the dev docs. And when I say ALL of them, I mean ALL of them. When you're starting out, you probably won't need 85-90% of the stuff in these docs. When you need them however, it's going to be invaluable that you've made a mental note of what is and is not possible within Shopify.

Without a doubt, the Shopify API (despite its occasional quirks) is one of the most comprehensive APIs I have ever consumed from a platform or provider. A lot is laid out for you, and it provides an everything-but-the-kitchen-sink set of tools to pull from. Basically anything you can do through the Admin UI is programmable, and has clearly defined objects and examples to work with.

The API is all-in on GraphQL however, so if that's not your cup of tea, heed warning here! I personally had not done a lot of work with GraphQL prior to this, but I certainly fell in love with it over the course of developing this new storefront. The API's breadth is mostly thanks to Shopify's history in the e-commerce space, and how it has evolved over the past few years.

Delegate out to the Shopify Admin Panel whenever possible

If it's not something that requires automation or specific tweaking, I would not recommend building a feature that accomplishes the same thing the Admin Panel accomplishes. The reason for this is that you're going to spend a lot of time in the Admin Panel for your store's operations anyways, and it's pretty well built. The page performance does suffer a bit sometimes, but Shopify is working on re-structuring the Admin Panel. If you're interested in a more in-depth dive, check out Craig Brunner's “Remixing the Shopify Admin” talk at Remix Jam 2025.

Tooling, Programming Languages, and Decisions

If you're starting to develop with Shopify from scratch, you are free to choose which language to use for making GraphQL calls. If a GraphQL client exists for your language, you're off to the races. However, Shopify has the best documentation and existing ecosystem tools based around TypeScript. If you're just starting out, get comfortable with TypeScript and choose it. It will save a lot of setup time, and probably aspirin too.

The existing code-base for this store's inventory sync was already written in Kotlin (and unfortunately, a bit of a mess). For the sake of development speed, I stuck with Kotlin as feed ingest and high-level sync management code already existed. This was… a decision for sure. I like Kotlin as a language, but maybe not the fighting I had to do with Gradle and the JVM along the way. Thankfully, Apollo GraphQL has a Kotlin client, which got me up and running with great type-safe queries/mutations right away.

This is where the docs come in handy, as well as GraphQL best practices. You want to make sure to shape your queries appropriately. Check the input and output objects of the queries/mutations, and structure your requests to provide and request the minimum number of fields you need. Your type checking comes into play here, as you should get back an object with a type specific to your query result: just the fields you need. If a field doesn't exist on the type and you need it, fix your query.
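As a rough illustration of a "minimal" query shape, here is a sketch in TypeScript. The query name, the `ProductSummaryResult` and `RugSummary` types, and the `toRugSummaries` helper are illustrative names of my own choosing, and the result type is written by hand here; in a real project a codegen tool would normally derive it from the query.

```typescript
// A query that asks only for what a sync job might need: id, title, and handle.
const PRODUCT_SUMMARY_QUERY = `
  query ProductSummaries($first: Int!) {
    products(first: $first) {
      edges {
        node {
          id
          title
          handle
        }
      }
    }
  }
`;

// Hand-written result type mirroring the query above (hypothetical names).
// Fields that were not requested simply do not exist on the type.
interface ProductSummaryResult {
  products: {
    edges: Array<{ node: { id: string; title: string; handle: string } }>;
  };
}

interface RugSummary {
  id: string;
  title: string;
  handle: string;
}

// Unwrap the edges/node envelope into a flat list.
function toRugSummaries(result: ProductSummaryResult): RugSummary[] {
  return result.products.edges.map((edge) => ({ ...edge.node }));
}
```

If the code later needs, say, the vendor, the compiler complains that the field is missing from the type, which is the cue to go back and extend the query.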

Get Comfortable with Products, Variants, Options, Metafields, and GIDs

Products, and The Default Variant

Shopify structures products and variants interestingly. Recent changes to Shopify mean there is basically always a default variant on a product, even if it's not a product that could have variants (such as an individual unique oriental rug). Missing the docs on this cost me about half a day, as querying for the product itself wasn't returning the data I needed. You need to query for the variant as well, using the concept of nodes and edges within GraphQL. If you have a lot of variants on a product, you'll want to take that into account as well when querying, due to the limits Shopify places on how many nodes different queries can return.
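A minimal sketch of reaching the default variant through edges and nodes, assuming a one-of-a-kind product where only the first variant matters. The query name, the interfaces, and the `firstVariant` helper are hypothetical; the selected fields are ones I would expect a sync job to care about, not a definitive list.

```typescript
// Ask for the product plus its first (default) variant only.
const PRODUCT_WITH_DEFAULT_VARIANT = `
  query ProductWithVariant($id: ID!) {
    product(id: $id) {
      id
      title
      variants(first: 1) {
        edges {
          node {
            id
            sku
            price
            inventoryQuantity
          }
        }
      }
    }
  }
`;

// Hand-written types mirroring the query above (hypothetical names).
interface VariantNode {
  id: string;
  sku: string | null;
  price: string; // money values come back as strings
  inventoryQuantity: number | null;
}

interface ProductWithVariants {
  id: string;
  title: string;
  variants: { edges: Array<{ node: VariantNode }> };
}

// Return the first variant node, or undefined if the connection came back empty.
function firstVariant(product: ProductWithVariants): VariantNode | undefined {
  const edge = product.variants.edges[0];
  return edge ? edge.node : undefined;
}
```

For products with many variants, `first: 1` would need to become a larger page size with pagination, which is where the node limits mentioned above start to matter.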

What are Variants and Options

Have you ever seen a clothing store where you can select a design and get a shirt in your preferred size and color? Those choices are options and variants. An option is an attribute such as “Color” (with values like “Black, Gray, White”) or “Size” (“Small, Medium, Large, Extra Large”), and a variant is one purchasable combination of option values, such as a Black shirt in Medium. Shopify previously limited a product to 100 variants across up to 3 options. However, in 2026, Shopify increased this limit to 2048 variants.
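To make the relationship concrete, here is a small sketch that treats “Color” and “Size” as options and expands every combination of their values into a variant. `OptionSketch` and `expandVariants` are illustrative names, not Shopify types, and this only models the counting, not how Shopify stores the data.

```typescript
// One option = a name plus its allowed values (hypothetical shape).
interface OptionSketch {
  name: string;
  values: string[];
}

// Build every combination of option values; each combination is one variant.
function expandVariants(options: OptionSketch[]): Array<Record<string, string>> {
  let combos: Array<Record<string, string>> = [{}];
  for (const option of options) {
    const next: Array<Record<string, string>> = [];
    for (const combo of combos) {
      for (const value of option.values) {
        next.push({ ...combo, [option.name]: value });
      }
    }
    combos = next;
  }
  return combos;
}

const shirtVariants = expandVariants([
  { name: "Color", values: ["Black", "Gray", "White"] },
  { name: "Size", values: ["Small", "Medium", "Large", "Extra Large"] },
]);
```

Three colors times four sizes gives 12 variants in `shirtVariants`, which shows how quickly a product with several options can run into a per-product variant limit.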

What are GIDs

Shopify uses Global Identifiers to refer to a single resource. GIDs are URIs and are very handy to keep track of when working with data in the API. You can see some examples of the types of Shopify GIDs here. If you're making requests through your type-safe queries, you don't really have to worry about these all the time. But what if you want to query a large dataset, such as 15000 products… when the Shopify pagination is limited to 250 products?
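GIDs follow the pattern `gid://shopify/<Type>/<id>`, so a couple of tiny helpers go a long way. This is a sketch; `parseGid` and `toGid` are names I made up, and the parser deliberately ignores anything after a `?` in the identifier.

```typescript
interface ParsedGid {
  type: string; // e.g. "Product" or "ProductVariant"
  id: string;   // the trailing identifier, kept as a string
}

// Split a GID into its resource type and identifier; null if it does not match.
function parseGid(gid: string): ParsedGid | null {
  const match = /^gid:\/\/shopify\/([A-Za-z]+)\/([^\/?#]+)/.exec(gid);
  if (!match) return null;
  return { type: match[1], id: match[2] };
}

// Build a GID from a resource type and identifier.
function toGid(type: string, id: string | number): string {
  return `gid://shopify/${type}/${id}`;
}
```

Knowing the type from the GID alone becomes useful when a response mixes products and variants in one flat list.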

What are Metafields

Metafields are Shopify's way of storing metadata on most of their objects. This is custom data that you can set, for basically whatever purpose. For example, it's useful to have product data stored in Shopify to display on the product page, or to sort/filter by.

Let's start with the case of an oriental rug. The product may have details such as the following:

  • The main color on the rug design, known as the “Field Color” (ivory and blue are popular ones!)
  • The colors on the border, “Border Colors” (reds and blacks are common here)
  • Any other colors, “Other Colors”
  • The material that makes up the structure of the rug, “Foundation”
  • The material that makes up the visible surface of the rug, “Pile”
  • The process and technique used to knot the rug, “Weave”
  • The size dimensions of the rug, “Width” and “Length”, typically stored in inches for conversion later
  • The size category of the rug, “Size Category”, used for quick filtering
  • The condition of the rug, “Condition”, usually “New with tags”, or “Preowned”

Check out About metafields for more information on the available types of metafields, and how they work.
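As a sketch of how the rug details above could map onto metafields: each metafield has a namespace, a key, a type, and a string value. The `custom` namespace, the key names, and the `RugDetails` shape are assumptions for illustration, and only a few of the listed details are shown.

```typescript
// Hypothetical shape of one rug from an inventory feed.
interface RugDetails {
  fieldColor: string;
  borderColors: string[];
  pile: string;
  widthInches: number;
  lengthInches: number;
  condition: string;
}

// Minimal metafield input shape: values are always sent as strings.
interface MetafieldInputSketch {
  namespace: string;
  key: string;
  type: string;
  value: string;
}

function rugMetafields(rug: RugDetails): MetafieldInputSketch[] {
  const ns = "custom"; // assumed namespace
  return [
    { namespace: ns, key: "field_color", type: "single_line_text_field", value: rug.fieldColor },
    // List types take a JSON-encoded array as their string value.
    { namespace: ns, key: "border_colors", type: "list.single_line_text_field", value: JSON.stringify(rug.borderColors) },
    { namespace: ns, key: "pile", type: "single_line_text_field", value: rug.pile },
    { namespace: ns, key: "width_in", type: "number_decimal", value: String(rug.widthInches) },
    { namespace: ns, key: "length_in", type: "number_decimal", value: String(rug.lengthInches) },
    { namespace: ns, key: "condition", type: "single_line_text_field", value: rug.condition },
  ];
}
```

Storing width and length as plain numbers in inches keeps the later unit conversion on the storefront side.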

Quick Bonus: What are Collections?

Collections can be thought of like product categories, but they are better thought of as just an assortment of products. You can create collections that follow your product categories, or collections like "Living Room", or "Clearance", etc. They're great organizational tools.

Bulk Operations: Queries and Mutations

If you need to synchronize your entire inventory ad hoc, or pull the entirety of something, such as all of your customer records, you'll want to get familiar with Bulk Operations. Bulk operations are GraphQL requests that aren't subject to the size limits of standard GraphQL requests. Many of Shopify's queries paginate at a maximum of 250 entries at a time. Bulk operations get around that limit, although the process is a bit annoying.

Bulk Query

A bulk query will return data in JSONL format, so each line is a separate JSON object. If you query for 10 products with 5 variants each, you'll get back 60 lines of JSONL (1 product line plus 5 variant lines = 6 lines, times 10 products = 60 lines). If each product also returned 4 nodes from a second nested connection, that's (1+5+4)*10 = 100 lines. Each line can have a unique GID. You'll need to handle parsing this JSONL, and rebuilding the objects you need from the dataset. Child lines carry a __parentId field that refers to their parent's GID, so using this you can rebuild the graph of data that was returned from the query or mutation (…hey…that “Graph” in “GraphQL” suddenly makes a lot more sense now doesn't it? 🥴).
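Here is a minimal sketch of that reassembly in TypeScript. It relies on the `__parentId` field that bulk query output puts on child lines; the `BulkLine` and `BulkTreeNode` types and the `parseBulkJsonl` function are illustrative names, and the two-pass approach is simply one way to avoid depending on line order.

```typescript
// One parsed JSONL line: arbitrary fields, plus optional id and __parentId.
interface BulkLine {
  id?: string;
  __parentId?: string;
  [field: string]: unknown;
}

interface BulkTreeNode {
  record: BulkLine;
  children: BulkTreeNode[];
}

function parseBulkJsonl(jsonl: string): BulkTreeNode[] {
  // Pass 1: parse every non-empty line and index the ones that have an id.
  const nodes: BulkTreeNode[] = [];
  const byId = new Map<string, BulkTreeNode>();
  for (const rawLine of jsonl.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue;
    const node: BulkTreeNode = { record: JSON.parse(line) as BulkLine, children: [] };
    nodes.push(node);
    if (typeof node.record.id === "string") byId.set(node.record.id, node);
  }
  // Pass 2: attach children to parents; lines without a known parent become roots.
  const roots: BulkTreeNode[] = [];
  for (const node of nodes) {
    const parentId = node.record.__parentId;
    const parent = typeof parentId === "string" ? byId.get(parentId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node);
  }
  return roots;
}
```

This keeps every record as loosely typed data. A real job would validate each line against a schema before trusting it, and for very large files it would stream the lines rather than hold the whole string in memory.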

Parsing this data in Kotlin was tedious, and writing a JSON parser for this with reassembled types was annoying to say the least. I never quite found a good way of doing this, as I had to use JSON libraries to convert to native Kotlin/JVM objects. Again, this is where TypeScript comes in handy. You can use any library that implements the Standard Schema Specification and get instant validation and parsing of your bulk query.

Bulk Query Workflow

Check out Performing bulk queries with the GraphQL Admin API to see how this process works more in depth, but here's the gist of it:

  1. Figure out your GraphQL query. Ideally, test it outside of a bulk operation
  2. Submit a bulkOperationRunQuery with the query you want to run
  3. Poll for the operation to finish, or subscribe to the bulk_operations/finish webhook to get notified immediately
  4. Download the JSONL from the completed bulkOperation
  5. Parse, validate, and assemble objects
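Steps 2 through 4 above can be sketched without committing to any particular HTTP client. The helper names (`buildBulkQueryMutation`, `decideNextStep`) and the snapshot type are hypothetical, and treating a completed operation with no URL as an empty result is an assumption on my part.

```typescript
// Wrap an already-tested query in a bulkOperationRunQuery mutation.
// Assumes the inner query does not itself contain a triple quote.
function buildBulkQueryMutation(innerQuery: string): string {
  return `mutation {
  bulkOperationRunQuery(query: """${innerQuery}""") {
    bulkOperation { id status }
    userErrors { field message }
  }
}`;
}

type BulkStatus =
  | "CREATED" | "RUNNING" | "COMPLETED"
  | "FAILED" | "CANCELING" | "CANCELED" | "EXPIRED";

// What a poll (or a webhook follow-up query) is assumed to give back.
interface BulkOperationSnapshot {
  status: BulkStatus;
  url: string | null;
}

type PollDecision =
  | { action: "wait" }
  | { action: "download"; url: string }
  | { action: "empty" }
  | { action: "fail"; reason: string };

// Decide what the sync job should do next for a given snapshot.
function decideNextStep(op: BulkOperationSnapshot): PollDecision {
  switch (op.status) {
    case "CREATED":
    case "RUNNING":
    case "CANCELING":
      return { action: "wait" };
    case "COMPLETED":
      // Assumption: no URL on a completed operation means nothing was returned.
      return op.url ? { action: "download", url: op.url } : { action: "empty" };
    default:
      return { action: "fail", reason: `bulk operation ended with status ${op.status}` };
  }
}
```

Keeping the decision logic pure like this makes it easy to unit test, and the same function works whether the snapshot arrives from a polling loop or after a webhook notification.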

Bulk Mutations

Bulk Mutations work a little differently than queries, but since we covered the basics of bulk queries, I'll leave you to read Bulk import data with the GraphQL Admin API, and lay out the basics here:

  1. Figure out your GraphQL mutation, and all the variables each entry needs
  2. Serialize your objects to JSONL that includes the variables
  3. Run a stagedUploadsCreate to reserve the space
  4. Upload the file to the URL returned by the create operation and wait for it to complete
  5. Run bulkOperationRunMutation with your mutation, and the staged upload path (the key parameter returned by stagedUploadsCreate)
  6. Wait for the operation to finish (poll, or webhook)
  7. Download the JSONL from the completed bulkOperation
  8. Parse, validate, and assemble objects
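A sketch of the serialization side of the steps above (steps 2, 3, and 5). The staged upload input values follow what I recall from Shopify's bulk import guide, so verify them against the current docs; the helper names and the example variables are hypothetical.

```typescript
// Input for stagedUploadsCreate (values as I recall them from the guide; verify).
const STAGED_UPLOAD_INPUT = {
  resource: "BULK_MUTATION_VARIABLES",
  filename: "bulk_op_vars",
  mimeType: "text/jsonl",
  httpMethod: "POST",
};

// One JSON object of variables per line, one line per mutation run.
function toVariablesJsonl(variableSets: Array<Record<string, unknown>>): string {
  return variableSets.map((vars) => JSON.stringify(vars)).join("\n");
}

interface StagedParameter {
  name: string;
  value: string;
}

// The path handed to bulkOperationRunMutation is the "key" upload parameter.
function findStagedUploadPath(parameters: StagedParameter[]): string | undefined {
  for (const parameter of parameters) {
    if (parameter.name === "key") return parameter.value;
  }
  return undefined;
}

// Example: two hypothetical products to create, keyed by the mutation's variable name.
const exampleJsonl = toVariablesJsonl([
  { input: { title: "Rug A", vendor: "Vendor 1" } },
  { input: { title: "Rug B", vendor: "Vendor 2" } },
]);
```

The top-level key of each line (`input` here) has to match the variable name declared in the mutation you pass to the bulk operation.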

Putting It All Together

Inventory Synchronization: A Diff System

The best way I came up with to synchronize the entire inventory at once (sort of like an ETL job) was to create a “light” diff (read: “difference”) system that checks for differences between my inventory feeds, and the data returned from a bulk query.

Here's the general idea:

  • In parallel, fire off processes/functions that fetch all your feed data, and run the Shopify Bulk Query. For my case, I parse and validate the CSV / XLSX files into Kotlin objects. In your ideal case, hit your inventory API, or run a SQL query against your database 😉
  • Wait for your operations to finish, before moving onto the “diff” step
  • Compare your desired inventory state against the returned Shopify inventory state, and sort out what needs to change by creating “tasks”
  • Define the bulk mutations that need to be fired off, and the data from the tasks
  • Fire your bulk mutations, wait for them to complete, and analyze their returned results.

This sounds simple in theory, but is probably going to be the most complicated part of the code for your inventory synchronization. There's a lot that can go wrong here, and you might want to account for network failures, query/data errors, and generally you need to think through the heuristics of your inventory operations.

Ask yourself:

  • What if there's a failure acquiring my data source? What will happen?
    • You might accidentally set every product's inventory to a quantity of 0 (ask me how I know…)
  • What if there's a hiccup with Shopify's bulk query? If I drop packets and lose the connection, what will happen?
    • You might accidentally duplicate all of your product listings (ask me how I know…)
  • What if I don't account for all of the changes a product might potentially have, such as the title or product images updating from my data source?
    • You might end up with some stale data on a product (again… ask me how I know…)
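One cheap guard against the first failure mode above is a sanity check that runs before any diffing. This is a sketch: the function name, the verdict shape, and especially the 20% threshold are arbitrary assumptions that you would tune to your own inventory churn.

```typescript
interface SafetyVerdict {
  ok: boolean;
  reason?: string;
}

// Refuse to sync if the feed is empty or would remove a suspicious share of listings.
function checkFeedIsPlausible(
  feedCount: number,
  shopifyCount: number,
  maxDropRatio: number = 0.2, // assumed threshold, not a recommendation
): SafetyVerdict {
  if (feedCount === 0) {
    return { ok: false, reason: "feed is empty; refusing to sync" };
  }
  if (shopifyCount > 0) {
    const dropRatio = (shopifyCount - feedCount) / shopifyCount;
    if (dropRatio > maxDropRatio) {
      const percent = Math.round(dropRatio * 100);
      return { ok: false, reason: `feed would remove ${percent}% of listings` };
    }
  }
  return { ok: true };
}
```

A similar check on the Shopify side (only diff against a bulk query that actually completed and parsed) helps with the duplicate-listing scenario, since an empty or truncated Shopify snapshot makes every feed item look new.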

Heuristics is the name of the game here. Plan out and document what you need synchronized, what could go wrong in the process, and what should change when. You'll have a much easier time implementing the code when you can clearly read in plain language “X should happen with Y conditions, with Z side effects”. In general, that's a good rule of thumb when dealing with any hairy business logic.

A note on the diff system

I basically sort out product tasks into the following operations:

Add - the product is not listed on Shopify yet, so create the item (productCreate mutation)

Update - the product price/quantity/details changed, so edit the item (productUpdate mutation)

End - the product is out of stock and won't be restocked. Shopify doesn't let you bulk delete products, so you may want to handle this differently depending on your case. I have a job that can go through a queue of all products to delete, and send individual productDelete mutations as fast as the API can handle them. You can also just send an update that only changes the inventory quantity without removing the product; this is useful when you might restock the item.

That's basically it. You're either adding products, editing them, or taking them off the platform.
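The three operations above can be sketched as a small diff function. Matching feed items to Shopify products by SKU is an assumption for this example, as are the `FeedItem`, `ShopifyItem`, and `SyncTask` shapes; a real diff would compare whichever fields your store actually needs synchronized.

```typescript
interface FeedItem {
  sku: string;
  title: string;
  price: string;
  quantity: number;
}

interface ShopifyItem extends FeedItem {
  productId: string; // the product's GID
}

type SyncTask =
  | { kind: "add"; item: FeedItem }
  | { kind: "update"; productId: string; item: FeedItem }
  | { kind: "end"; productId: string };

function diffInventory(feed: FeedItem[], shopify: ShopifyItem[]): SyncTask[] {
  const tasks: SyncTask[] = [];
  const existing = new Map<string, ShopifyItem>();
  for (const listed of shopify) existing.set(listed.sku, listed);

  const seen = new Set<string>();
  for (const item of feed) {
    seen.add(item.sku);
    const current = existing.get(item.sku);
    if (!current) {
      tasks.push({ kind: "add", item }); // not on Shopify yet
    } else if (
      current.title !== item.title ||
      current.price !== item.price ||
      current.quantity !== item.quantity
    ) {
      tasks.push({ kind: "update", productId: current.productId, item });
    }
  }
  // Anything on Shopify that the feed no longer mentions gets ended.
  for (const listed of shopify) {
    if (!seen.has(listed.sku)) tasks.push({ kind: "end", productId: listed.productId });
  }
  return tasks;
}
```

Unchanged items produce no task at all, which keeps the follow-up bulk mutations as small as possible.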

Conclusion

This blog post ended up both shorter and longer than I thought it would be, as I decided to split up my chronicles in Shopify development into multiple parts. I have really enjoyed my time being a Shopify developer, though not without its complications and frustrations.

Shopify can be frustrating at times to develop for, and sometimes it seems like their developers are desperately trying to keep the spice flowing. The Shopify CEO Tobias Lütke has even said in the past (although I can no longer find the Tweet) that decisions he made in the original Ruby on Rails codebase for Shopify over a decade ago cause limitations that persist to this day (including the aforementioned 100 variant limit). It's these kinds of things, along with other quirks and performance issues, that make one wonder whether Shopify's commitment to embracing AI can pull the developer platform out of the odd state it is in.

Even so, for what Shopify provides for small to medium sized businesses, it's worth the time spent developing and building with it. The plans are reasonably priced for small businesses, and the tools you get out of the box honestly DO make you ship faster for e-commerce.

With Shopify, I deprecated that 16 GB VPS hosting Magento, and now run the inventory sync jobs on a 4 GB VPS. Shopify hosts the website through their Online Store feature. There are many more improvements I am making over time though, such as migrating off "Online Store" in favor of Shopify Hydrogen, hosted on Shopify Oxygen.

All in all, I've grown a lot as a developer overall, and especially honed my skills in physical goods sales. Stay tuned for my next planned entry in this Shopify development blog series!