How Homebodiy Automatically Generates Supply Lists for your Home Projects
Using OpenAI and the Vercel AI SDK's Structured Object Streaming
Hi 👋
I'm Dylan, a co-founder at Homebodiy, the digital toolbox for homeowners. I'm also our product engineer, so I spend a ton of time talking to homeowners about why homeownership is hard and what we could do to make it easier.
Our "digital toolbox" is a whole collection of software tools designed to make home improvement and maintenance easier, more rewarding, and more accessible. As part of that, we recently launched a tool called Projects: a way to plan, track, and manage all the home improvement projects you're considering or have underway.
Why are supply lists hard to make?
As I chatted with homeowners, I heard over and over again how unknown unknowns make it hard for them to plan out their projects. If you don't know all the steps you need to take to complete a project, how could you possibly know what supplies you'll need? What if your project is different from what you found online? Figuring this out would take a lot of googling. You would need to make a plan, understand all the steps, and then note down the materials you'd need for each step. An expert tradesperson could do this for you, of course, but it would take a consultation and some amount of time and money (they're doing exactly what you would do, but they have far fewer unknowns to deal with thanks to their experience and trade knowledge).
For a lot of homeowners, those unknowns prevent them from even considering DIYing their projects. We're trying to shed light on those unknowns, one at a time, until they're confident enough in their understanding and ability to just start.
Okay - so how did we make it easy?
Traditionally, we would have tried to solve this by sending your project idea off to an expert and sending you back the supply list they created. That would be expensive, slow, and require quite a few manual processes.
Using AI, we could make this fast and cheap, but the response comes back in a single blob of text, so we still have to do work to translate that into distinct supply list items.
If only the AI could give us back structured data so that we didn't need to manually parse and create the items…
Good news! It can now.
This week, OpenAI released official support for returning structured objects from certain models. The Vercel AI SDK has supported this for a bit now, and their version 3.3 update goes further by allowing you to stream structured data back to the client.
That's huge.
Here's what that looks like, in practice, when generating a supply list for a home improvement project:
This turns hours of research into a 10-second wait, and you can watch as the items are added to the list.
What made this possible?
Until now, interfacing with large language models (LLMs) meant sending in some instructions and trying to coerce the model to output a blob of text that was roughly in the format you wanted. Most of the time, that meant putting instructions like this into your prompt:
please format your response in json. do not return anything besides the json object. I will parse this response directly. use this format:
{
  cars: [
    {
      make: string,
      model: string,
      year: number,
      costUSD: number,
    }
  ]
}
Depending on the model you're using, doing this may or may not give you what you want. It definitely wasn't guaranteed, and if you were relying on the data to come back in that very specific format, then you might need to run the response back through a validator and then rerun the query if validation failed.
This made it hard to use LLMs for structured data, i.e. creating things that you would store in a database for future use. This, to me, is one of the core reasons why LLM interfaces have largely been relegated to chatbots. The novelty of using natural language was of course a large factor, but not being able to control what the LLM spits out with any reasonable certainty meant that creating more complex interfaces required a ton of data validation, coercing, and query retries. Now couple that with building modern, authenticated, and dynamic interfaces, and you'll quickly see why teams didn't feel it was worth the hassle to build anything that wasn't chat.
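That validate-and-retry loop looked something like this. This is a minimal sketch, not production code; `callLLM` and `validate` are hypothetical placeholders for your model client and your schema validator (e.g. a Zod `safeParse` wrapper):

```typescript
// Minimal sketch of the old validate-and-retry loop.
// `callLLM` and `validate` are hypothetical placeholders.
async function generateWithRetry<T>(
  callLLM: () => Promise<string>,
  validate: (value: unknown) => T | null,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callLLM();
    try {
      const candidate = JSON.parse(raw); // throws if the blob isn't valid JSON
      const result = validate(candidate);
      if (result !== null) return result;
    } catch {
      // Response wasn't valid JSON at all; fall through and retry.
    }
  }
  throw new Error(`No valid response after ${maxAttempts} attempts`);
}
```

Every retry is another round trip to the model, so each malformed blob costs you both latency and tokens.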
With this new support for structured data, that difficulty melts away. You can create what your customers want on demand.
The technical bits
Now I'll give a quick overview of how we did this. Vercel's docs for this are great, so my implementation isn't much different than what they show.
First, let's define the use case. Having a good understanding of the inputs and outputs will help us define everything else.
We want to generate a supply list for a home improvement project. We know that we have access to the project's data (like title, description, budget, etc). We also want the user to feel involved in the process, so we'll make them press a button to trigger the generation. Once the button is pressed, the user will see a new Supply List card in a loading state with the number of items ticking up as they're created. When it's finished generating, the card will become clickable, taking the user to the full collection.
Okay, now that we know what the inputs are and what the user outcome should be, let's define all the stuff we need to make this work:
Schema for our structured output
Which model we want to use
A system prompt that tells that model how to approach its task
A user prompt that includes the project description and budget
Functions to save the generated items to our database
Let's check out how these all fit together.
First, let's define the schema of the desired output. Vercel and OpenAI both let you do this with Zod, a TypeScript schema validator, so that's what I used (and so should you - Zod is great). We need the model to spit out a list of items that we can put into the database, so the structure of those items should be something that we can easily transfer over. The field names don't have to be exactly the same, but we do want to make sure the types and general structure are right.
Keep in mind that your schema will be informed by what you plan on doing with the data, but here's mine:
import { z } from "zod";

export const GeneratedCollectionItemsSchema = z.object({
items: z
.array(
z.object({
title: z.string().describe("The title of the item"),
description: z
.string()
.describe("The description of the item and its purpose"),
tags: z
.array(z.string())
.length(2)
.describe(
"The tags of the item. You can use this to categorize the item, like 'consumable', 'material', 'tool', etc.",
),
quantity: z
.number()
.describe("The quantity of the item needed for the project"),
}),
)
.describe("A list of items to be supplied for the project."),
collectionId: z
.string()
.describe(
"The id of the collection to add the items to. It will be provided by the client.",
),
});
This might look a little messy with the constrained code block, but it's actually super easy to wrangle. The main things you want to observe here are:
You can represent your expected JSON response declaratively. You state what shape you want and let the LLM handle the content.
You describe each property to be filled. While not technically required (the LLM can sometimes infer based on the property name), describing the property gives the LLM a better idea of what to put where. The "tags" in the code block are a good example of that, indicating some tag examples to pull from.
Next we pick our model. We'll use GPT-4o-mini because it should be sufficient and cost-effective, but you should choose the model based on your use case. Check the latest documentation for notes on structured object model compatibility. If you're using the Vercel AI SDK like I am, you have more options (for now).
Then you'll write your system prompt. This is the core instruction the model will use. Here's an example of what you could do for this case, though not exactly what I did:
const createSupplyListSystemPrompt = `You are a helpful assistant that creates a supply list for a project. The supply list should be a list of items that are needed for the project.`;
You should tweak this prompt repeatedly to get exactly the kinds of results you're looking for, but I won't go into prompt iteration here.
Next, figure out what you're going to put in your user prompt. This is generally going to be context and user data that you want the LLM to act on. I'll just be passing in a short description of the project.
const userPrompt = `Please create a supply list for the following project based on its description: ${projectDescription}`
Now, put it all together. I used the AI SDK's new streamObject and useObject functions for this. streamObject is what you run on the server, and it takes all the stuff we just defined, calls the LLM, and returns a response that allows you to read it as a stream. This is how we'll stream in the "# of items" text as the items get generated.
Supply the model, system prompt, schema, user prompt, and then a function to handle the object data. onFinish will get called automatically once the LLM is finished running, so it's a good place to send stuff off to your database, throw some logs, or whatever else you need to do with the completed object.
import { openai } from "@ai-sdk/openai";
import { streamObject } from "ai";

const result = await streamObject({
model: openai("gpt-4o-mini"),
system: createSupplyListSystemPrompt,
schema: GeneratedCollectionItemsSchema,
prompt: userPrompt,
async onFinish({ object }) {
console.log("Object", object);
if (object) {
// if the object is successfully generated,
// put them in the database
await createSupplyListItems(object);
}
},
});
return result.toTextStreamResponse();
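For context, here's roughly how that server code could be wired up end to end as a Next.js App Router route handler. The file path, schema import, and `createSupplyListItems` helper are assumptions for illustration, not Homebodiy's actual code:

```typescript
// app/api/supplylist/route.ts (assumed path; adjust to your project)
import { openai } from "@ai-sdk/openai";
import { streamObject } from "ai";
import { GeneratedCollectionItemsSchema } from "@/lib/schemas"; // assumed location
import { createSupplyListItems } from "@/lib/db"; // assumed database helper

const createSupplyListSystemPrompt = `You are a helpful assistant that creates a supply list for a project. The supply list should be a list of items that are needed for the project.`;

export async function POST(req: Request) {
  // useObject POSTs its submit argument as the JSON request body
  const { projectDescription } = await req.json();

  const result = await streamObject({
    model: openai("gpt-4o-mini"),
    system: createSupplyListSystemPrompt,
    schema: GeneratedCollectionItemsSchema,
    prompt: `Please create a supply list for the following project based on its description: ${projectDescription}`,
    async onFinish({ object }) {
      if (object) {
        await createSupplyListItems(object);
      }
    },
  });

  return result.toTextStreamResponse();
}
```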
The useObject piece is for the client. It's listed as experimental in the AI SDK docs right now, so use caution (it might be unstable or change in the future). It's a React hook that gives you back a few helpers: submit, isLoading, and object. submit is a function that you hook up to whatever you want to trigger the generation. object holds the structured response from the LLM. And isLoading is just a boolean that lets you know whether or not the object is currently being created.
When instantiating the hook, just tell it where your API endpoint is (this should be the endpoint that will give you back the streamObject response from earlier), what schema to expect for the object, and anything else you want to happen when everything is finished.
Here's what that looks like:
import { experimental_useObject as useObject } from "ai/react";

const { submit, isLoading, object } = useObject({
api: "/api/supplylist",
schema: GeneratedCollectionItemsSchema,
onFinish({ object }) {
console.log(object);
// optimistically update your client state
},
});
And that's it! The only thing left to do is hook those helpers up to our UI. Here's a super simple example of how that could look, where the button triggers the submit and the Card is rendered once the object starts being generated:
<div>
  <button
    type="submit"
    disabled={isLoading}
    // submit's argument is sent as the request body to your API route
    onClick={() => submit({ projectDescription })}
  >
    Generate supply list
  </button>
  {object?.items && <Card items={object.items} />}
</div>
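If you want the "# of items" count to tick up while the list streams in (as described earlier), you can derive a label from the hook's partial object. This `cardLabel` helper is an assumption for illustration, not part of the SDK:

```typescript
// A small helper (not part of the AI SDK) that derives the card's label
// from useObject's values. While streaming, `object` is a partial result,
// so `object?.items?.length` grows as items arrive.
type PartialSupplyList = { items?: Array<{ title?: string }> } | undefined;

function cardLabel(isLoading: boolean, object: PartialSupplyList): string {
  const count = object?.items?.length ?? 0;
  if (isLoading) return `Generating supply list… ${count} item${count === 1 ? "" : "s"}`;
  return count > 0
    ? `Supply list (${count} item${count === 1 ? "" : "s"})`
    : "No supply list yet";
}
```

Rendering `cardLabel(isLoading, object)` inside the Card re-runs on every streamed update, which is what makes the count visibly tick up.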
Wrapping up
Thanks for reading! If you like these in-depth looks at building consumer AI products, subscribe to the newsletter.
You can check out Homebodiy here.