July 24, 2023

How Gretel found product-market fit

Editor's note: 

SFG 26: Ali Golshan on unlocking synthetic data for developers

In this episode of the Startup Field Guide podcast, Sandhya Hegde and Wei Lien Dang chat with Ali Golshan, CEO and co-founder of Gretel, about the company's path to product-market fit. Gretel is a synthetic data platform that allows developers to generate artificial data sets with the same characteristics as real data so they can test AI models without compromising sensitive customer information or privacy. Gretel has a community of over 75,000 developers working with accurate, synthetic data.

Be sure to check out more Startup Field Guide Podcast episodes on Spotify, Apple, and YouTube. Hosted by Unusual Ventures General Partner Sandhya Hegde (former EVP at Amplitude), the SFG podcast uncovers how the top unicorn founders of today really found product-market fit.

If you are interested in learning more about the topics we discuss in this episode, please check out our resources on defining your ICP, finding early adopters, and finding the right co-founders.

TL;DR

  • The founding insight: Gretel started as a side project around privacy that Ali worked on with co-founders Alex and John while he was still at StackRox. The founders wanted to solve the cold-start problem of data and the bottleneck of getting access to data, which developers suffer from the most.
  • Core ICP: The founders wanted to focus on developers because they suffer the most from the cold-start problem of data; getting access to and preparing data takes up roughly a third of every project.
  • Iterating to product-market fit: The market pull was more around the utility and accuracy of the output of their models versus privacy, and they ended up orienting their messaging around synthetic data.
  • Early design partners: Gretel raised a seed round from Greylock, productized an API for synthetic data, collaborated with Google on privacy, and validated assumptions about user preferences for building blocks vs. prescriptive workflows.
  • Gretel’s AI strategy: Gretel’s founders accelerated their investment in multimodal synthetic data platforms and built a framework called Gretel GPT, which has become a valuable tool for customers. 

Episode transcript

Sandhya Hegde:

Welcome to The Startup Field Guide, where we learn from successful founders of unicorn startups how their companies truly found product-market fit. I'm your host, Sandhya Hegde, and today with my cohost, Wei, we'll be diving into the story of Gretel. Started in 2020, Gretel is a synthetic data platform for developers. Gretel makes it really easy to generate artificial datasets with the same characteristics as real data, which often has things like sensitive customer information in it. So with Gretel, you can actually test AI models without compromising any data privacy. Over 75,000 developers use Gretel Cloud already. And today we have with us the CEO and Co-Founder of the company, Ali Golshan.

The founding insight behind Gretel: Solving the cold-start problem of data for developers

Sandhya Hegde:

Ali, you actually started your career way back when in US intelligence, right? What was that like and how has that informed your approach to being a startup founder?

Ali Golshan:

For me that was very much an accidental path. When I was in university, I was studying computer science and math, I had some mischievous times, and I was expelled from university. And this is before it was cool to be expelled or get kicked out of university or leave school. And it had to do with a lot of what I would consider to be white hat hacking at the time. Naturally, that skillset was very interesting for the intelligence community, so I was recruited. I went to the US intelligence community and focused on a lot of things around signals intelligence, reverse engineering, analytics, things like that. So, that was somewhat of an indirect, unplanned path into the US intelligence community. But it was phenomenal, because I got to work with some of the most amazingly talented people in the world. They had some of the brightest minds, especially on bleeding-edge technologies at the time. My two co-founders, Alex and John, whom I work with now at Gretel, both came out of the NSA. So just phenomenal people in that whole industry.

Wei Lien Dang:

Ali, Gretel is your third startup now. You started with Cyphort, you founded StackRox, and then now there's Gretel. I remember the first time we chatted about Gretel, you were so excited about it. I'd love to ask, what made you feel so compelled to do another company, to do a third company, and to go after this idea in particular?

Ali Golshan:

Sure. There were a lot of reasons for it. Originally when we started working on it with Alex and John, we had known each other for like a decade-plus, and this is still four, five years back, and we were always very eager to find something to work together on. So Gretel actually started as an evening-and-weekend hackathon: let's find an open source project and tool to work together on. And it actually initially started when we were looking at some devices in our homes just for fun and found out how much data they were moving out. And we were like, well, the world is going to become more and more edge-based. There is going to be more and more data. How do we just build some privacy tooling? So the concept really started from a standpoint of, "Let's do something together around privacy, because it's just something we're passionate about."

So, Alex Watson, one of my co-founders, he had sold his company to AWS. He was a GM there. And after four years, he's like, "F*** this. I'm out." There's only so much AWS you can take, and I'm assuming, Wei, you can attest to that. So he decided to come and actually do it full-time, and John left shortly after that with him. And part of the thesis there was that this application of privacy — and very early on they were orienting around differential privacy — could be enormously valuable for this emergence of AI. And this is like 2018, 2019, before it was super buzzy.

I started putting in a little bit more time into it, because, you could probably remember, Wei, we were at StackRox. It was a pretty miserable time at that point for us. It was very exhausting, and we were just feeling a little burnt out with everything we were doing. So quite transparently, you're a founder, you're in it, but it wasn't very fun. So Gretel almost became just like a side-project hobby that I worked on with Alex and John. We founded it together, but then they became involved full-time actually building the company while Wei and I were still at StackRox. That's how it all came about. And then obviously after about a year and a half, when Red Hat ended up acquiring StackRox, that's when I ended up leaving and joining Alex and John full-time at Gretel.

Wei Lien Dang:

I love this authenticity around the privacy space that you have, Ali. I think a lot of people don't necessarily realize that Gretel references the idea that there are digital breadcrumbs that you're trying to ... You know, you leave a trail, you're trying to sniff it out, protect against it, and so-

Ali Golshan:

Yeah. The whole thought process was that if you leave a single bit of digital breadcrumb, it will forever be remembered by you. So that's why we come up ... It's like, it has to be dealt with from the root. You basically have to leave no footprints for privacy to really work.

Wei Lien Dang:

And I remember when you first were starting out, you were really driving a lot of thought leadership around this space of privacy engineering and serving developers and giving them this toolkit to help them solve privacy problems. I'm curious. We always think about the why now behind a new idea and a new company. For you guys, the three of you, what was the "why now" behind privacy engineering? Why was it time to build a product that developers needed to solve those problems?

Ali Golshan:

The main reason for it was a few different problems. One, the three of us had always dealt with the cold start problem of data. We even saw this at StackRox. Like, hey, we want to build a cluster of tools, we want to test it. There's no data, customers can't share their data, it's all sensitive. So part of it was this bottleneck of getting access to data and the cold start problem. And people who typically suffer most from that are the developers. Eventually, that makes it downstream and somebody in marketing or sales or ops or finance uses a tool and they have to deal with the same problem. But at the ground, the person who's building it still has to test it somehow. That was a very fundamental problem that we were trying to solve.

The privacy part of it was always the driver for it, because we viewed ourselves as very similar, a corollary to how Apple talks about their products. You buy it for the user experience or polish or capabilities, but privacy's at the core. So when we released it, our view was that actually privacy engineering as a service, where you just have a bunch of APIs and it makes your data easy, safe, and private to work with, is the right way to orient.

And that was actually one early assumption that turned out really not to be a fit for our product and our users. The privacy messaging in the early days really tended to orient us a lot more towards privacy engineers, security engineers, compliance teams. And it kept shifting the conversation to CISOs, CIOs, CSOs, people who quite frankly have no idea what the ML stack looks like. Like, what does it mean for a model to memorize data?

But the thing we were always very adamant about is, and this is based on a lot of ... You know, as founders do, they have biases, and we have all of our historic context and experiences. So we really did want to focus on the developer, because we felt like they're the ones who are suffering the most. And we remember seeing this report from Kaggle that talked about how roughly a third of every project is dedicated to just getting access to data, normalizing it, and making it usable. And that was what we wanted to remove as a chunk, because if you could remove that, then building other vertical use cases or getting it a little bit more scaled was easier.

So we started with that, but very quickly we found we didn't really want to be classified with traditional security and privacy companies. And then what we also found was that the pull in the market for us was more around the utility and the accuracy of the output of our models versus the privacy. Privacy was like, okay, it's great that it's private, we want that, but how good is the accuracy and the utility?

So it became a little bit of this pull where ... In Asia and Europe, actually, privacy was pulling us. But in North America, it was more utility and accuracy. And at the same time, we started to see a lot of the emergence of the generative AI technologies and buzz. And because that became a tailwind, we ended up essentially orienting even down to our messaging around what we would consider to be the synthetic data. So we said, "Privacy first, synthetic is how we do it," and we just raised that synthetic piece to the top. And the part of it that Alex and John actually had the foresight ... 

Gretel actually started in 2019 using language models to build synthetic data, because our assumption was language models actually carry much better insights from the data itself. So there was a lot of the pieces there, but the orientation ended up changing because, like most industries, positioning and messaging almost dictates your funnel process.

How Gretel found early customers: Implementing early discoveries into the product roadmap

Sandhya Hegde:

I would love to build on that a little, if you could break it down in terms of timing. But what was the process of taking that kernel of the idea and actually finding what we like to call the first few desperate customers, like figuring out where you have pull? What were some of the activities you focused on, and how long did it take you to arrive at, "Okay, the use case is synthetic data for testing AI models, and this is the person who will climb the hills to get access to our data product?"

Ali Golshan:

Yeah. The original open source project really started in the end of 2019. Summer of 2019 was when Alex, John, and I started putting together some very loose open source packages. And the thinking there was, can we use deep learning models with differential privacy to build synthetically generated data? That was actually the first piece that we started.

Then towards early 2020, so in February of 2020, Gretel raised a $3.5 million seed round from Greylock. It was Sridhar Ramaswamy who ended up investing in the company. And it was actually very much an alignment of thesis. Sridhar was the SVP of engineering and ads at Google for about 10 years, and one of his theses was, "I had dozens and dozens of engineers who did this for me, so if there was an API that did this, it would be very valuable to the industry."

And again, this is when I was still at StackRox. What Alex and John did is they very quickly productized that API, so it was behind some consumable UI. They released early access six months later and just basically got a few hundred developers on the platform to start getting some traction and feedback.

And part of it was, one of the early assumptions we made was, if you're using synthetic data, you really don't need things like data labeling, programmatic labeling. Synthetic data comes out perfectly labeled, perfectly balanced. You don't need all of these additional tools. So we wanted to validate that, just because it made a huge difference as to where we would fit into the MLOps workflow and the types of companies we would potentially be complementary to or partner with, versus duplicative of.

And when we released that, there were a couple of early discoveries. One was we very quickly realized, because there were already synthetic data companies in the market and synthetic data was a known entity as a concept, that the association or the misperception of synthetic data was really this notion of fake data. "Oh, you need QA testing, you need volumes and volumes of testing, you can use synthetic data. But if you really want to test things, use your raw data, properly labeled data, it has the most amount of value." And we realized, "Oh, okay, so not only is the market early, we also have to entirely change the perception of what synthetic data built with language models for AI actually means."

Then the other part of it was very early on we were collaborating with Google on privacy, and one of the things we realized was the implementation of privacy at the right time of model training makes a huge difference. To oversimplify: most companies take differential privacy, train a model, bolt it on top, generate some sort of noise, and that basically creates differentially private data. Our team realized very early on you can train models with differential privacy, and the difference there is whether the model itself is sharable or not. If the model is sharable, you can expose that model, and essentially the eventual view was you could commercialize that model from its data.
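To make that distinction concrete, here is a minimal sketch of training with differential privacy (DP-SGD) using the open source Opacus library for PyTorch. The toy model, data, and privacy parameters are purely illustrative and say nothing about Gretel's actual implementation; the point is that noise and gradient clipping are applied during training, so the resulting model itself can be shared.

```python
# Minimal DP-SGD sketch with Opacus (illustrative; not Gretel's implementation).
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy tabular data and model purely for illustration.
features = torch.randn(1024, 16)
labels = torch.randint(0, 2, (1024,))
train_loader = DataLoader(TensorDataset(features, labels), batch_size=64)

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
criterion = nn.CrossEntropyLoss()

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.1,  # noise added to clipped per-sample gradients
    max_grad_norm=1.0,     # per-sample gradient clipping bound
)

for epoch in range(3):
    for batch_features, batch_labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(batch_features), batch_labels)
        loss.backward()
        optimizer.step()

# The privacy budget spent during training is tracked by the engine.
print(f"epsilon spent: {privacy_engine.get_epsilon(delta=1e-5):.2f}")
```

In the bolt-on approach Ali contrasts this with, noise is instead added to a model's outputs after the fact, so the trained model still memorizes raw records and cannot itself be safely shared.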

And then the third assumption that we wanted to validate and learn from is, we had a couple of pieces in our product; one we called classification and the other one was transformers. Transformers, not GPT transformers, transformers in the sense of tokenization, encryption, anonymization, as a precursor to privacy synthetics itself. So we wanted to have this industrial-grade, bulletproof privacy. And what we wanted to see was whether users and developers wanted to use these tools independently, essentially have building blocks or LEGO blocks, or whether they wanted end-to-end workflows.

And what we found out, and this was contrary to our original thesis, was developers like the notion of building blocks once they get very sophisticated and scaled, because they can call each one of those blocks through an API and build their own automation. But from step zero to seven, they just want very prescriptive, opinionated workflows.

So when we released this alpha in September of 2020, we had all those three big learnings. We took all of that, put it back into the product. Luckily, we were very transparent with Sridhar. He actually really loved that stuff. He ended up preempting our Series A, right? When we shared all these learnings with him, we thought he would be like, "What the f***? You told me this stuff works." And he's like, "Great learnings. Let's double down." So kudos to him for helping us. And that's really when we took a lot of that and started building it going forward. And it was early-to-mid 2021 that we actually ended up rolling out the GA of all these particular products. And that's roughly when I ended up joining as well.

Sandhya Hegde:

Do you remember whether there was a specific industry, like fintech? You would assume, given the value prop around privacy, that there are specific industries that are maybe early adopters, but I'm curious what you actually observed in terms of who were the first few people to be evangelists for Gretel.

Ali Golshan:

It's interesting you say that. Recently I've had some conversations with some investors, and one of the questions I always ask them is, "Where do you see synthetic?" They always reach out and they're like, "Oh, we would love to talk to you. We have this thesis." I'm like, "Tell us about your thesis." It's like, "Synthetic data's great." I'm like, "Cool, okay." But I always ask them about use cases, and they're like, "Regulated industries." And I feel like that's actually quite a lazy answer. The reason is that regulated industries, like health, finance, government, there is an amount of inbound that you get from them, but the regulated industries need to meet a minimum privacy bar or safety bar for some access to the data. Now, whether that data is balanced, biased, underrepresented, improperly distributed, or enough data in itself, those are actually all unanswered questions.

So, regulation is a very, very low bar, because for that data to truly have valuable output, you need to do a lot more prep with it, whether you boost it or balance it. In some cases, what we even do when we talk about accuracy of data ... We consider ourselves in the data business, not the synthetic data business. Some of our customers who have more advanced use cases, the target they set with us is, "We want to improve our downstream prediction or inference over 100%." And what that means is their raw data produces 100% as part of their baseline, and what they do with us is they take a bunch of these underrepresented datasets or demographics, they create more synthetic versions of them, and augment their raw data with synthetic data, so they have a much more balanced, perfected dataset.
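As a rough illustration of that augmentation pattern, the sketch below tops up under-represented groups in a table with synthetic rows before training. The generate_synthetic callable is a deliberate placeholder for whatever generator you use (Gretel's APIs are not shown here), and the column name and target count are made up.

```python
import pandas as pd

def augment_minority_groups(df: pd.DataFrame, group_col: str,
                            target_count: int, generate_synthetic) -> pd.DataFrame:
    """Top up under-represented groups with synthetic rows.

    generate_synthetic(seed_df, n) is a placeholder for the synthetic-data
    generator of your choice; it should return n new rows shaped like seed_df.
    """
    pieces = [df]
    for group_value, group_df in df.groupby(group_col):
        shortfall = target_count - len(group_df)
        if shortfall > 0:
            pieces.append(generate_synthetic(group_df, shortfall))
    return pd.concat(pieces, ignore_index=True)

# Usage sketch: bring every demographic up to at least 5,000 rows before training.
# balanced = augment_minority_groups(raw_df, "demographic", 5000, generate_synthetic)
```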

So these were some of the things that I think became, in our opinion, more horizontal use cases. We still do maybe 10, 15% of our total pipeline generation through outbound lead generation, because it makes sense. We tend to focus the overwhelming majority of it on what we call qualified inbound. As part of that, we do see regulated industries play a part in it, but as we're starting to see the first patterns of maturity, we're finding that just being a regulated industry is actually just purely the bottom of the totem pole, which is, "I just want to share my data. What I do with it needs a lot more sophistication and higher bars."

Sandhya Hegde:

What are the characteristics you look for to say, "Oh my God, that should be a good customer for us"?

Ali Golshan:

When we ended up betting on Kubernetes about two, three years into the company, the tailwind was very clear, just a massive lift right away. The reason I mention that is that similar things happened at Gretel, which is, once the lift of large language models, generative AI, all this stuff, started to hit, it just created an enormous amount of tailwind for us.

So really the inflection point was, "I need to use data, I need to access data. I have an enormous bottleneck or friction, and now that is around a bunch of things that naturally give you a second or third or fourth chance within a customer to be able to use you even if something doesn't work perfectly," and that helped accelerate a lot of things for us.

How Gretel anticipated and adapted to the emergence of LLMs 

Wei Lien Dang:

Ali, do you think there were any lessons learned in terms of ... At StackRox, even early on, a lot of people were like, "Eh, container security, it's only for financial services and for people who have heightened security requirements." People didn't really realize that Kubernetes was going to become as pervasive and widely adopted as it has been, right? I'm curious, in the context of AI, obviously it opens up so many use cases for a product like Gretel. How much did you anticipate that? I know you were working with models, but did you really anticipate the wave we've seen around ChatGPT and Stable Diffusion and AI more broadly, and how it's captured people's imaginations?

Ali Golshan:

Yeah. I'm glad your question about learnings from StackRox to here was about product, not generally on people, because I was about to unload. I think in the context of product, there were a few things we did anticipate, and then there were a few that just completely ended up being accidental.

Maybe talking first about things that we actually anticipated, and if you go back to our seed deck, we always talked about this: our premise was always that the right privacy and safety can accelerate access to data, and our thesis was with the emergence of AI, edge computing, all these pushes, people are going to need more data. You can't prevent them from getting or collecting data, so you need a way to basically reduce the economics of data acquisition. And we wanted to figure out a way to solve for that. Now, we didn't anticipate large language models specifically to be just 99% dominant in that big category. We thought it would be a bunch of things which would push to that, which is also why we started with the privacy engineering messaging around our tooling, because we weren't just focused on large language models.

But then the thing that ended up creating more and more tailwind, and where we won quite a bit on a single differentiation, was the accuracy and the utility of our output. People were like, "I want to start looking more and more into using synthetic data for machine learning or for AI." As a result, even a 5 to 10% accuracy difference is a world of difference for us.

I would argue the thing that for us became opportunistic, not foresight, was what we called the last-mile problem, which is now one of the biggest use cases that drives us. The thesis that we developed last year as we saw some of these language models evolving was, when you have language models, right now, the economics of them are essentially that you are just powering and paying cloud providers to use GPUs to a certain degree, and then they're the ones who are making money. At some point, you have to take these foundation models and economically make them viable. They become platforms, they become verticalized use cases. And to do that, you can't find that data in the public domain. So that's fine-tuning, optimization. If you want to make ChatGPT your company GPT for chat, you have to train it on your data.

But how do you make sure you're not loading all your sensitive data, with the model memorizing things that have a right to be forgotten? So all the privacy implications, all the way to a path of what if someday ... And this was actually within our vision of: how do we allow companies to monetize or commercialize their data? So if you're a massive financial institution and your data's valuable for you to train your models, but you also want to commercialize it safely for others to train on your data, that was a big problem we wanted to solve. So our view was that, okay, well, training on private domain data is actually the key to commercializing a lot of these foundation models that are being built and taking them that last mile.

So, I would say our thesis around the general data bottleneck held to be true. The more practical, applied version of it, what it looks like day to day, that LLM training, especially on private data and doing it safely, became more of an accelerant last year. And I think one of the very key things for us that was half-true was our assumption about our user and buyer. We started, as you all pointed out, with a developer-centric view, and we still view ourselves as a developer-centric platform. And part of it is we wanted to build around cloud-native technologies. We didn't want to build professional services. So all of this meant we would be able to build a particular product, but for you to extend that into your platform, you needed at least a bare minimum programming capability so you can write a few scripts or a few calls and be able to integrate it into APIs and other types of platforms.

But what we've learned over the last year commercializing is, while the technical champion and decision maker is the developer, they may actually never be the buyer, and the buyer may never, ever touch our product. So of those 75,000 users that we have on our platform that are building models and teaching us about how to auto-config and auto-tune models, not a single one of them may be a buyer, and they may never actually see that platform.

Originally, we had built a lot of our funnel, from top-of-the-funnel customer acquisition all the way down, in this orientation that we have to find a developer. But about six to nine months ago, we found that actually, if we break this out and say the PQL is the product-qualified user who can be the technical champion, and the MQL or sales-qualified lead is actually somebody who's a buyer, and we don't build one common denominator for both of these, we can actually be much more proficient.

So it was learnings like that, contrary to our original thinking, where we had to make changes. And a lot of it goes back to what you were saying, Sandhya: it feels like every week there's a new pattern that you have to adjust to.

I think the key things are some of our original concepts around developers wanting to build with this type of technology, synthetic data with differential privacy, and language models have held true. One of the headaches I'm glad to have avoided at Gretel compared to StackRox is we didn't have to make that product pivot. But messaging, user, and ICP positioning have all been completely new learnings. And this is something I wish I had known at my first companies: when we hired our first VP of sales at Gretel here, the person we hired, a guy named Jeremy, I remind him all the time, "You're not a VP of sales. You're a business officer. You're literally trying to figure out how to commercialize something in an industry that had never existed before."

So I think it's very important to bring those in. And our assumption, again, was some things you could do in parallel. But what we realized is you have to build a product, then you have to build some messaging, then you have to try pricing, then you have to iterate. It was a lot of serial processes, because in a nascent market, you figure out a base, because if you have multiple verticals dependent on each other and one turns out not to be true, everything is wiped. So it was a little bit more systematic, which was probably a slower progression into the market for us. But luckily, it has turned out to be a relatively fast uptake.

Sandhya Hegde:

I'm curious, you did say that, obviously, your original thesis has only gotten stronger with what has happened in terms of just how much interest there is in the entire business ecosystem to try to leverage the data they have and leverage LLMs in particular, but all kinds of foundation models. But I'm curious, there must still have been some big bets that you needed to place once you saw what was happening. So if you go back to maybe nine months or a year ago, what were some of the big unknowns for you as you saw the Stable Diffusion launch, the ChatGPT launch? What were the big questions that maybe you now have better answers for, but at the time were tough decisions to make?

Ali Golshan:

One thing that we ended up accelerating because of Stability AI's work and OpenAI's work was, from day one, and we have blogs about this, one of our major bets also in synthetics was synthetic data has to mirror real practical use cases of data, so the platform for synthetic data has to be multimodal, not just a single modality. And the thesis, actually, Wei, was very similar to the learnings we had in security. Build one platform if you can rather than vertical tools for every new functionality that comes up.

With that, we started with tabular data and time series data. And we had built an internal framework called the Model Integration Framework. This is actually one of our core differentiators. It enables us to tune and build our own models substantially faster than a lot of other players. When DALL-E came out and when ChatGPT came out, we realized that we actually had to pull that forward and invest a lot more in the different modalities that we have. So that was something that certainly happened as an accelerant. We weren't planning on releasing some of the modalities as fast as we did.

And some of them ran into early friction. Some of them we had to build in an unnatural way just to find out whether the user even matched our core user. We tend to not think about modalities as all equal. We think about it like, based on our core users, based on our use cases, which ones have compounding value, and that's how we determine the directionality, not like, image is hot, let's go do image. So that was one that certainly changed our roadmap around a little bit, but that was just the normal pull we were seeing from customers.

The other part of it was the same framework we had built actually did something that we never anticipated would be valuable for anybody other than us, but now it's one of the biggest pulls, which is something very similar to what Mosaic does. We had built this framework we call Gretel GPT. And Gretel GPT is actually not just a generative pre-trained transformer. It's actually a framework that you can plug any open source GPT into, and it will bolt on a bunch of our own unique capabilities: differentially private training, tuning, gradient clipping, validation modeling, reporting, text measurements for accuracy. All these enterprise-grade tools that we wanted for our own GPT.

To your point, nine months ago, maybe most LLM bake-offs were won by OpenAI. Now there are a dozen different options, and that number's reducing. So as that trend was starting to build, we were just getting the same asks: Can we just plug another model into your GPT, use all this framework, run it in our own environment, apply all these privacy tools, so you can not only be a model but also a safe framework for fine-tuning, training, or experimenting with LLMs?
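To make the "plug in another model" idea concrete, here is a rough sketch of fine-tuning an arbitrary open-source causal language model on private-domain records with the Hugging Face transformers library. This is not Gretel GPT's actual API; the checkpoint, the toy records, and the omitted safeguard steps are illustrative stand-ins for the kind of swappable-model-plus-wrapped-safeguards framework Ali describes.

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Any open-source causal LM checkpoint could be swapped in here.
checkpoint = "EleutherAI/gpt-neo-125m"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Stand-in for the private-domain records you would actually fine-tune on.
records = Dataset.from_dict({"text": [
    "account_type=checking, region=EU, balance_band=low",
    "account_type=savings, region=US, balance_band=high",
]})
tokenized = records.map(lambda r: tokenizer(r["text"], truncation=True),
                        remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tuned-synthetic-model",
                           num_train_epochs=1, per_device_train_batch_size=2),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Sample a synthetic record from the tuned model.
prompt = tokenizer("account_type=", return_tensors="pt")
print(tokenizer.decode(model.generate(**prompt, max_new_tokens=30)[0]))

# In a framework like the one described in the transcript, privacy accounting,
# gradient clipping, and output validation would wrap this loop; omitted here.
```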

So we're seeing a lot of traction on that. If you look at our customer deck, it's one of the big use cases that we never anticipated would be one. But again, credit here goes to my co-founders, Alex and John. They had the foresight to build this for us, and now it's becoming a very valuable framework.

So yeah, I think there's a lot of things we've opportunistically jumped on and have had this tailwind to go with. But I think any founder will tell you in an early market you'll take any sort of unfair advantage you can get to land yourself in a customer.

How Gretel’s founders make decisions about their product roadmap

Wei Lien Dang:

Ali, I'm just curious, from a practical perspective, a lot of what you're describing requires really quick adaptation to the learnings that you're gaining and what's happening in the market. With AI, things are moving so quickly from week to week. How do you, Alex, and John operate in a way that allows you to really change things quickly and decisively? You know what I mean? We often think the most successful founders, one of the things that's common to them is the slope or rate of change they're able to drive. I would say at StackRox it was the opposite. We had our assumptions, and we were not adaptive enough, early on at least, and then we had to pull it back from the brink. But at Gretel, how did the three of you work together to move quickly in a sensible and rational way?

Ali Golshan:

I actually think the early years of StackRox really shaped my thinking about the early days of Gretel. At the same time, if you remember, Alex came out of four years at AWS. He was a GM there. He launched three of their fastest-growing services. That all brought us to this fact that we always knew that much more sophisticated, large-scale advanced customers, especially because of privacy, will never want to use Gretel Cloud and use us as a SaaS service. But what we always figured is if we leave X amount of credits every month in perpetuity, we will build up a user base there.

So we actually use our cloud to not only enable individual users to be able to build, develop, build it into their reference frameworks, and validate all the claims we're making very openly and make it very transparent, but the biggest value we get from it is we actually introduce all of our new features in Gretel Cloud first, see how they work, expose them to that 75,000 users, get iterations, workflows, what breaks, all these sorts of things, and then we take that and we build it into our hybrid model that deploys in customers' own environments in an owned cloud and can scale.

So one area for us has really been to use that as the tip of the spear for experimentation and loading everything much faster. And while we do have some enterprise customers, it is really a little bit more of an early preview landscape for us. But it gives us much faster velocity, much faster iteration, experimentation. We even take some subset of our most advanced users, give them early previews of products. So that has been very helpful.

The other one that I can say 100% has been shaped by my experiences is, as you said, Wei, there's sometimes no good option. It's just two bad options, because you really don't know, and you just have to make some tough choices. And I think the three of us are just very comfortable making difficult choices. It partially comes down to the team. This is really a topic that I don't think is talked about enough, which is that I think the board dynamics with founders have a lot to do with how quickly you make decisions, because I think founders a lot of times feel compelled, like, "I have to double down, we have to make this work!"

Wei, you may remember this. I remember we had a board member who, like nine months after we had launched, was like, "You got to give this thing time to work," and you and I were like, "We know this doesn't work. We got to change." And I think here a lot of the board top-down was like, "We're supportive, we bet on you guys because we think you know what you're doing." We had this massive amount of input data. And then we got very particular about the type of people we hired, like some of the folks that came from StackRox, like people who just wanted to innovate. And it was a combination of make tough decisions, have an environment where you can get very fast, iterative feedback, so not everybody's debating philosophically, is this right, is that wrong? We can throw something out and in two weeks a few thousand people can use it.

And then the third one is that I think John, Alex, and I as founders have really good complementary skills, similarly to how you and I operated at StackRox. So I think it's just a bunch of different things that have given us this velocity. We had an all-hands today, board meeting last week. I do say that our product velocity and releases are something I'm very proud of. The team has done an incredible job creating high-velocity releases there.

Sandhya Hegde:

I'm taking away a very nuanced lesson here as a board member, which is don't make founders sell products they don't like.

Ali Golshan:

As long as the founders are somewhat selling to people who are kind of like them. But if you have a founder selling to someone that is not at all like them, and they don't like it, then you're probably right to put your foot down. But yeah, I think how the board interacts with founders has a lot to do with the speed, because I think ... you call something a pivot, and it has such a negative connotation. But we don't call it a pivot, but we make hundreds of minor adjustments from month to month. We constantly tell the whole company, "This is the North Star, we're going to zigzag our way there. And as long as we're just generally trending the right way, that's the most important part."

So we try to shorten those lifecycles as much as possible, and I think it really helps that especially John and Alex, coming from that disciplined background of the AWSes of the world, very quick iterations, customer voice ... I'll give you one example. One of the things we ended up doing, to make the voice of the customer even more pronounced in the R&D org, based on a lot of the learnings, which I'm sure is an entire deep dive in the AI space in itself, was we actually ended up eliminating all the SC and CX functions in sales. We literally removed it all, and we built a customer-facing team inside R&D, which is actually a couple of extra folks we hired with applied science and engineering backgrounds that report to engineering managers and applied scientists, but they're purely working with our customers on fine-tuning models, optimization, large-scale deployment, integrations into cloud-native frameworks. And once we realized that, they're talking to their peers, they're deeply technical, they're not translating something with missing words, and that very quickly makes it back into the product and we can push it through.

It's all those little optimizations that I think have helped. Essentially, again, having top-down support, but being tough on decisions. We've made some decisions that it's like ... I'm happy to use an example with a customer if you want, where we're like, we're going to basically shut this off and let the product speak for itself if it's going to work.

Gretel’s approach to driving adoption within traditional enterprise businesses

Sandhya Hegde:

You brought up enterprise, and one of the things we're seeing is that right now, at least when it comes to traditional enterprise businesses, maybe not large software companies, there's a lot of desire to figure out, "How do we leverage this, let's not get left behind," but there's very little in-house talent in terms of actually figuring out even how to deploy something like Gretel, let alone recreate it in-house. What are you seeing as the biggest enablers? So you have an old-school enterprise company that maybe has a lot of data, but has never actually deployed ML in production. What do you see as the stepping stones and the drivers of true enterprise adoption here?

Ali Golshan:

I think to talk about what's the biggest enabler, we should talk about what is our biggest friction point. And our biggest friction point is really operationalizing in an automated way. And part of this, again, back to the StackRox learnings, is that in a very technical space that is very nascent, operationalizing these tools is very difficult. Something else we learned roughly about a year, year and a half ago was that operationalizing Gretel was not really about deploying Gretel. Gretel's a few containers or a bunch of APIs, or you run it in the cloud.

Where the most friction is, is in the data that has to come to us, and where we have to send that synthetic data out to. So we actually ended up allocating a good portion of our product roadmap every release to what we call downstream-upstream connectors, like one-button connect to S3, BigQuery, getting that data into us. One-button deployment on top of Kubernetes so you can deploy. Making sure all of our UI functionality is replicated in the API, so if somebody's running in Vertex or SageMaker or Azure ML, they don't have to run the Gretel UI. They can transact through the marketplace and plug it in and just run normally.
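As a minimal sketch of what such upstream and downstream connectors do, the snippet below pulls a source table from object storage, runs it through a placeholder synthesis step, and writes the synthetic table back for downstream consumers. The bucket paths are hypothetical, and synthesize_table is a stand-in, not a Gretel API.

```python
import pandas as pd

def synthesize_table(df: pd.DataFrame) -> pd.DataFrame:
    """Placeholder for the actual synthesis step; here it just resamples rows."""
    return df.sample(n=len(df), replace=True).reset_index(drop=True)

# Upstream connector: pandas reads s3:// paths directly when s3fs is installed.
source = pd.read_parquet("s3://example-bucket/raw/transactions.parquet")

# Downstream connector: write the synthetic table back where consuming teams read from.
synthesize_table(source).to_parquet("s3://example-bucket/synthetic/transactions.parquet")
```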

So I would say the biggest enabler was recognizing and admitting that sometimes it's not your product, but you just have to do things that are in your ecosystem to help your own product. And we ended up spending a lot of investment and cycles doing things that helped our product operationalize. And once we took one step further and operationalized that, we would go further and further. But we would make those very modular, because what we don't want is to say, "Here's an opinionated end-to-end workflow, and here's how you have to run it."

So it's sort of that decision of, the core product is very much product roadmap, and all these integrations are either through ecosystem partners, source available, or even open source connectors, things like that that just help. But I would say they're two sides of the same coin.

Sandhya Hegde:

I'm curious, the questions you get from more traditional customers, do you find yourself as a company educating them a lot on the ecosystem? And where is the lift heaviest in terms of education that is maybe adjacent to what you actually do as a company?

Ali Golshan:

I'm laughing because there was an early incident where we were talking to this customer, and founders, as we all know, we're only human, and I just lost my cool. I was like, "Okay, well, we're clearly not for you, because you don't get this. Let's all move on and be happier about this." This was a couple years back. Maybe I shouldn't be saying that, but it's true. We were like, "It's just not right here."

I think some of the more traditional questions we get are not necessarily because a company is traditional, but it's because we're talking to an org that is more traditional-thinking in their approach to something. This could be a very large financial company, audit company, or a tech company, but if we come across, for some reason, their security team or privacy team or compliance team, and they're used to, you know, I run my tools in air-gapped environments, this never leaves, I run SQL in my database on-prem, and this is how you access it, I need a reverse proxy, it's more that type of thinking that for us causes the most amount of friction, which is one of the reasons why we've oriented our ICP around a developer and a cloud-native stack, because that's actually a qualify-out for us. Oh, you want us to plug into Oracle 8 in your private data center? Well, you're probably best using one of the other synthetic data companies that do that.

So I would say it's more traditional workflows. But we've intentionally stayed away from those, because traditional workflows also don't have that large overlap of consistent and normalized integration points of APIs or RPCs and things like that, which is, I think, what gets you into the trouble of we need professional services, managed services, forward-deployed scientists. We wanted to always avoid that. So for those traditional workflows, we very willingly concede those and walk away from those, because we're trying to build a little bit more of an efficient asymmetric motion for go-to-market.

Wei Lien Dang:

Ali, we always think about the urgency that a prospect has. I think what you're describing is, if there's not alignment with the problem that they're actually looking to solve, you can't convince them. It just doesn't necessarily make sense. Sometimes maybe eventually they'll get there, but I think if they don't appreciate what your product's able to do here and now, then you're not in a good position to make them successful with it.

Ali Golshan:

You're actually hitting on something very important, though, Wei, which is the way we sell is actually very different from how OpenAI sells. I do think there are these waves to the whole generative AI space, which is I do think most companies are thinking about it as, "What is our LLM strategy? How are we going to use it? Which ones are we going to use? How are we going to tune them? Do we fine-tune, optimize, prompt-tune? Do we do it in our own environment? Do we do it in the cloud?" There are all these questions that they're having to answer, which in a way can create artificial pressure points for your sales team to fear-sell them. Like, this company, your competitor, is betting on these two LLMs, they're going to offer X amount less in services, or more, things like that.

For us and synthetic data, there are some companies that recognize the privacy implication, but for us it's still very much a value sell. What can we do with it? What is the outcome accuracy? What are the things that you can do with synthetic data that we couldn't do? Let us run it for two months and measure the difference in data access time, developer by developer, versus the original.

So some of the things that we still struggle with but are working through is how do we get those sales cycles for larger, sophisticated, high-potential customers from that six-to-nine month cycle down to three months so we can get much better forecasting. We make forecasts, but they're sort of educated guesses as to which quarter they fall into right now. We always tout that we don't have closed-lost or closed-no-decision deals, but that doesn't help when you have a hard time forecasting.

So I think for synthetics, we're still at the part of the curve where it's value sell, promissory sell. Companies that are very advanced, like some of our customers like Illumina, Snowflake, Google, they're forward-leaning. They have advanced use cases. They're like, "We know exactly what to do with this." But for I would say 75, 80% of our customers, it's like, "We believe in the thesis of synthetic data. Show us how it works in practice." And I think we need another six-to-nine months before we start to hit that tip, which is like, "We know we need it. We have a standard four-week POC cycle. We're going to validate X things."

And partially to accelerate that, we're building all these frameworks. Here's how we measure, here's how we actually tune, all these sort of things transparently. But some of it, unfortunately, just is market maturity that needs to happen and some of the cycles before us to get fully fleshed out.

Wei Lien Dang:

Part of what you're describing is just, with any such huge platform shift, it becomes part of organizational strategy and what do we do about it, and you have a bunch of folks who are trying to think through, "Well, this is our approach," and it takes time for that to settle a bit.

Sandhya Hegde:

Yeah. But you can hopefully pursue the Holy Grail of being able to write your own RFP. You made the category.

Ali Golshan:

We've gotten exceedingly proficient at that.

Ali Golshan’s advice to early-stage founders building in AI

Sandhya Hegde:

Oh, well, Ali, what would be your advice to early-stage founders, especially first-time founders, getting started in AI in 2023?

Ali Golshan:

In all honesty, what I've found makes life a lot easier is working with very, very proficient product people, and Wei worked out really great as a co-founder for me at StackRox, so I just tried to replicate that. But I think the main advice is, I think, around two things.

One, in the general generative AI space, is figuring out what you're releasing, where it fits into this cycle, right? And I think if you're starting with an LLM today, and even if you have a highly specialized orientation, like we're going to train it on healthcare data and healthcare tests and exams and all these sorts of things, I think it's very important to realize where in the overall cycle you're going to hit. Is that going to be more valuable, or is it more likely that companies like P&G and Johnson & Johnson are going to use Gretel and Mosaic and, well, Databricks now, and just fine-tune some existing LLM, and by the time you get yours out there are five other ones and you're basically pushed into this commoditization? And we, by the way, fully believe models will be fully commoditized. This is where we think data will be differentiated. This is our view that synthetic data can actually let your differentiated or unique data drive value.

So figuring out that curve is very important, not only because of consolidation and value, but also, what is your motion for selling going to be? If you release at the peak of hype, people are pulling you. If you're doing it as building is happening, you're basically having to demonstrate value. And if you're on that cliff falling, you're basically in this race to the bottom for margins, which is not very valuable, and it's consolidation that is happening, which is where I really, truly believe labeling is headed. I think every company that is doing labeling will have to pivot into something else, because labeling is getting commoditized.

The other advice I would generally give to founders is I think most founders have this romanticized view of what it's like to start a company. Yes, there will be friction with board, there will be ups and downs with product, and I think everybody knows this. But the one thing I always remind ... To the dislike of our ops team and talent team, I spend like five minutes of every interview trying to convince someone not to join us.

And part of it is I think the biggest thing that founders miss is ... It's the real extreme highs and lows of the business after the first couple of years. The first couple years as a technical founder are amazing. Your head's down, you're building. Everything's greenfield and promissory, and here's what we're going to look like as an IPO. And then there's that day that you open it up, you try to commercialize, sell it, price it, support it, take product feedback, work cross-functionally, where like Monday you're like, "We're going to fail," Tuesday you're like, "Oh, we won this big account, we're going to be a public company," Wednesday your big account goes away. Like, "Oh, shit."

So what I have found is, it's these extreme highs and lows and their frequency, and that just takes this massive toll on you. It's not work-life balance, it's just this ... Like, you wake up at 3:00 in the morning, and that anxiety is still there. As much as I hate to quote Zuckerberg, he was talking about it. It was like, being a founder is like waking up every morning and getting punched in the gut. It kind of feels like that, because nobody reaches out to you with the good stuff. That is all stuff they share face-to-face. At like 7:00 in the morning on Slack, all the stuff that's on fire is first-to-read. So those are the two words of wisdom I would leave with founders.

Sandhya Hegde:

Right, right. Only start a company for something you're willing to go through that for, or because you just cannot imagine working for someone anywhere. But-

Ali Golshan:

Yes. I mean, there's all the incentives, right? Or you believe this thing is economically so valuable that you're willing to put yourself through whatever it takes because you just want to make that money. But whatever it is, that motivation has to match that extreme scenario you deal with.

Sandhya Hegde:

Thank you so much for coming on The Field Guide, Ali. I'm really excited about your new product launches. Thank you so much for sharing the story. It's going to be so useful to our listeners.

All posts

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse varius enim in eros elementum tristique. Duis cursus, mi quis viverra ornare, eros dolor interdum nulla, ut commodo diam libero vitae erat. Aenean faucibus nibh et justo cursus id rutrum lorem imperdiet. Nunc ut sem vitae risus tristique posuere.

All posts
July 24, 2023
Portfolio
Unusual

How Gretel found product-market fit

Editor's note: 

SFG 26: Ali Golshan on unlocking synthetic data for developers

In this episode of the Startup Field Guide podcast, Sandhya Hegde and Weil Lien Dang chat with Ali Golshan, CEO and co-founder of Gretel about the company's path to product-market fit. Gretel is a synthetic data platform that allows developers to generate artificial data sets with the same characteristics as real data so they can test AI models without compromising sensitive customer information or privacy. Gretel has a community of over 75,000 developers working with accurate, synthetic data.

Be sure to check out more Startup Field Guide Podcast episodes on Spotify, Apple, and YouTube. Hosted by Unusual Ventures General Partner Sandhya Hegde (former EVP at Amplitude), the SFG podcast uncovers how the top unicorn founders of today really found product-market fit.

If you are interested in learning more about the topics we discuss in this episode, please check out our resources on defining your ICP, finding early adopters, and finding the right co-founders.

TL;DR

  • The founding insight: Gretel started as a side project around privacy that Ali worked on with co-founders Alex and John while he was still at Stackrox. The founders wanted to solve the cold start problem of data and the bottleneck of getting access to data, which developers suffer from the most.
  • Core ICP: The founders wanted to focus on developers because they suffer the most from the cold start problem of data, which takes up roughly a third of every project.
  • Iterating to product-market fit: The market pull was more around the utility and accuracy of the output of their models versus privacy, and they ended up orienting their messaging around synthetic data.
  • Early design partners: Gretel raised a seed round from Greylock and worked on productizing an API for synthetic data, collaborating with Google on privacy, and validating assumptions about user preferences for building blocks vs. prescriptive workflows.
  • Gretel’s AI strategy: Gretel’s founders accelerated their investment in multimodal synthetic data platforms and built a framework called Gretel GPT, which has become a valuable tool for customers. 

Episode transcript

Sandhya Hegde:

Welcome to The Startup Field Guide, where we learn from successful founders of unicorn startups how their companies truly found product-market fit. I'm your host, Sandhya Hegde, and today with my cohost, Wei, we'll be diving into the story of Gretel. Started in 2020, Gretel is a synthetic data platform for developers. Gretel makes it really easy to generate artificial datasets with the same characteristics as real data, which often has things like things like sensitive customer information in it. So with Gretel, you can actually test AI models without compromising any data privacy. Over 75,000 developers use Gretel Cloud already. And today we have with us the CEO and Co-Founder of the company, Ali Golshan. 

The founding insight behind Gretel: Solving the cold-start problem of data for developers

Sandhya Hegde:

Ali, you actually started your career way back when in US intelligence, right? What was that like and how has that informed your approach to being a startup founder?

Ali Golshan:

For me that was very much an accidental path. When I was in university, I was studying computer science and math, I had some mischievous times, and I was expelled from university. And this is before it was cool to be expelled or get kicked out of university or leave school. And it had to do with a lot of what I would consider to be white hat hacking at the time. Naturally, that skillset was very interesting for the intelligence community, so I was recruited. I went to US intelligence community and focused on a lot of things around signals intelligence, reverse engineering, analytics, things like that. So, that was somewhat of an indirect, unplanned path into the US intelligence community. But it was phenomenal, because I got to work with some of the most amazingly talented people in the world. They had some of the brightest minds, especially bleeding edge technologies at the time. My two co-founders, Alex and John, that I work with now in Gretel both came out of the NSA. So just phenomenal people in that whole industry.

Wei Lien Dang:

Ali, Gretel is your third startup now. You started with Cyphort, you founded StackRox, and then now there's Gretel. I remember the first time we chatted about Gretel, you were so excited about it. I'd love to ask, what made you feel so compelled to do another company, to do a third company, and to go after this idea in particular?

Ali Golshan:

Sure. There was a lot of reasons to it. Originally when we started working on it with Alex and John, we had known each other for like a decade-plus, and this is still four, five years back, and we were always very eager to find something to work together on. So Gretel actually started as an evening and weekend hackathon, let's find an open source project and tool to work together on. And it actually initially started when we were looking at some devices in our homes just for fun and found out how much data they were moving out. And we were like, well, the world is going to become more and more edge-based. There is going to be more and more data. How do we just build some privacy tooling? So the concept really started from a standpoint of, "Let's do something together around privacy, because it's just something we're passionate about."

So, Alex Watson, one of my co-founders, he had sold his company to AWS. He was a GM there. And after four years, he's like, "F*** this. I'm out." There's only so much AWS you can take, and I'm assuming, Wei, you can attest to that. So he decided to come and actually do it full-time, and John left shortly after that with him. And part of the thesis there was that this application of privacy (and very early on they were orienting around differential privacy) could be enormously valuable for this emergence of AI. And this is like 2018, 2019, before it was super buzzy.

I started putting a little bit more time into it, because, you could probably remember, Wei, we were at StackRox. It was a pretty miserable time at that point for us. It was very exhausting, and we were just feeling a little burnt out with everything we were doing. So quite transparently, you're a founder, you're in it, but it wasn't very fun. So Gretel almost became just like a side-project hobby that I worked on with Alex and John. We founded it together, but then they became involved full-time actually building the company while Wei and I were still at StackRox. That's how it all came about. And then obviously after about a year and a half, when Red Hat ended up acquiring StackRox, that's when I ended up leaving and joining Alex and John full-time at Gretel.

Wei Lien Dang:

I love this authenticity around the privacy space that you have, Ali. I think a lot of people don't necessarily realize that Gretel references the idea that there are digital breadcrumbs that you're trying to ... You know, you leave a trail, you're trying to sniff it out, protect against it, and so-

Ali Golshan:

Yeah. The whole thought process was that if you leave a single bit of digital breadcrumb, it will forever be remembered by you. So that's why we come up ... It's like, it has to be dealt with from the root. You basically have to leave no footprints for privacy to really work.

Wei Lien Dang:

And I remember when you first were starting out, you were really driving a lot of thought leadership around this space of privacy engineering and serving developers and giving them this toolkit to help them solve privacy problems. I'm curious. We always think about the why now behind a new idea and a new company. For you guys, the three of you, what was the “why now” behind why privacy engineering? Why was it time to build a product that developers needed to solve those problems?

Ali Golshan:

The main reason for it was a few different problems. One, the three of us had always dealt with the cold start problem of data. We even saw this at StackRox. Like, hey, we want to build a cluster of tools, we want to test it. There's no data, customers can't share their data, it's all sensitive. So part of it was this bottleneck of getting access to data and the cold start problem. And the people who typically suffer most from that are the developers. Eventually, that makes its way downstream and somebody in marketing or sales or ops or finance uses a tool and they have to deal with the same problem. But at the ground level, the person who's building it still has to test it somehow. That was a very fundamental problem that we were trying to solve.

The privacy part of it was always the driver for it, because we viewed ourselves as very similar, a corollary to how Apple talks about their products. You buy it for the user experience or polish or capabilities, but privacy's at the core. So when we released it, our view was that actually privacy engineering as a service, where you just have a bunch of APIs and it makes your data easy, safe, and private to work with, is the right way to orient.

And that was actually one early assumption that turned out really not to be a fit for our product and our users. The privacy messaging really in the early days tended to orient us a lot more towards privacy engineers, security engineers, compliance teams. And it kept shifting the conversation to CISOs, CIOs, CSOs, people who quite frankly have no idea what the ML stack looks like. Like, what does it mean for a model to memorize data?

But the thing we were always very adamant about is, and this is based on a lot of ... You know, as founders do, they have biases, and we have all of our historic context and experiences. So we really did want to focus on the developer, because we felt like they're the ones who are suffering the most. And we remember seeing this report from Kaggle that talked about how roughly a third of every project is dedicated to just getting access to data, normalizing it, and making it usable. And that was what we wanted to remove as a chunk, because if you could remove that, then building other vertical use cases or getting it a little bit more scaled was easier.

So we started with that, but very quickly we found we didn't really want to be classified with traditional security, privacy companies. And then what we also found was that the pull in the market for us was more around the utility and the accuracy of the output of our models versus the privacy; privacy was like, okay, it's great that it's private, we want that, but how good is the accuracy and the utility?

So it became a little bit of this pull where ... In Asia and Europe, actually, privacy was pulling us. But in North America, it was more utility and accuracy. And at the same time, we started to see a lot of the emergence of the generative AI technologies and buzz. And because that became a tailwind, we ended up essentially orienting even down to our messaging around what we would consider to be the synthetic data. So we said, "Privacy first, synthetic is how we do it," and we just raised that synthetic piece to the top. And the part of it that Alex and John actually had the foresight ... 

Gretel actually started in 2019 using language models to build synthetic data, because our assumption was language models actually carry much better insights from the data itself. So there was a lot of the pieces there, but the orientation ended up changing because, like most industries, positioning and messaging almost dictates your funnel process.

How Gretel found early customers: Implementing early discoveries into the product roadmap

Sandhya Hegde:

I would love to build on that a little, if you could break it down in terms of timing. But what was the process of taking that kernel of the idea and actually finding what we like to call the first few desperate customers, like figuring out where you have pull? What were some of the activities you focused on, and how long did it take you to arrive at, "Okay, the use case is synthetic data for testing AI models, and this is the person who will climb the hills to get access to our data product?"

Ali Golshan:

Yeah. The original open source project really started in the end of 2019. Summer of 2019 was when me, Alex, and John started putting together some very loosely assembled open source packages. And the thinking there was, can we use deep learning models with differential privacy to build synthetically generated data? That was actually the first piece that we started.

Then towards early 2020, so in February of 2020, Gretel raised a $3.5 million seed round from Greylock. It was Sridhar Ramaswamy who ended up investing in the company. And it was actually very much an alignment of thesis. So Sridhar was the SVP of engineering and ads for Google for about 10 years, and one of his theses was, "I had dozens and dozens of engineers who did this for me, so if there was an API that did this, it would be very valuable to the industry."

And again, this is when I was still at StackRox. What Alex and John did is they very quickly productized that API, so it was behind some consumable UI. They released an early access version six months later and just basically got a few hundred developers on the platform to start getting some traction and feedback.

And part of it was, one of the early assumptions we made was, if you're using synthetic data, you really don't need things like data labeling, programmatic labeling. Synthetic data comes out perfectly labeled, perfectly balanced. You don't need all of these additional tools. So we wanted to validate that, just because it made a huge difference as to where we would fit into the MLOps workflow and the types of companies we would potentially be complementary to or partner with, versus be a duplicate of.

And when we released that, there were a couple of early discoveries. One was we very quickly realized, because there were already synthetic data companies in the market and synthetic data was a known entity as a concept, that the association or the misperception of synthetic data was really this notion of fake data. "Oh, you need QA testing, you need volumes and volumes of testing, you can use synthetic data." But if you really want to test things, use your raw data, properly labeled data, it has the most amount of value. And we realized, "Oh, okay, so not only is the market early, but we also have to entirely change the perception of what synthetic data built with language models for AI actually means."

Then the other part of it was very early on we were collaborating with Google on privacy, and one of the things we realized was that the implementation of privacy at the right time of model training makes a huge difference. To overly simplify it: most companies take differential privacy, they train a model, they bolt it on top, they generate some sort of noise, and that basically creates differentially private data. Our team realized very early on that you can train models with differential privacy, and the difference there is whether the model itself is sharable or not. If the model is sharable, you can expose that model, and essentially the eventual view was that you could commercialize that model from its data.
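
[Editor's note: to make that distinction concrete, here is a minimal, illustrative sketch of the "train with differential privacy" approach Ali is describing, as opposed to adding noise after the fact. It is a toy DP-SGD-style loop written for this article, not Gretel's implementation; the model, clipping threshold, and noise scale are all assumptions.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: linear regression with a single weight vector.
X = rng.normal(size=(256, 8))
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=256)

w = np.zeros(8)
clip_norm = 1.0         # per-example gradient clipping threshold (assumed)
noise_multiplier = 1.1  # Gaussian noise scale relative to clip_norm (assumed)
lr = 0.1

for step in range(200):
    # Per-example gradients of squared error: grad_i = 2 * (x_i . w - y_i) * x_i
    residuals = X @ w - y
    per_example_grads = 2 * residuals[:, None] * X

    # Clip each example's gradient to bound its influence (the core DP-SGD idea).
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / (norms + 1e-12))

    # Add calibrated Gaussian noise to the summed gradient, then average.
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=w.shape)
    noisy_grad = (clipped.sum(axis=0) + noise) / len(X)

    w -= lr * noisy_grad

# Because the noise is injected during training, the trained weights themselves
# carry the privacy guarantee, which is what makes the model shareable, unlike
# post-hoc noising of a non-private model's output.
print("trained weights:", np.round(w, 3))
```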

And then the third assumption that we wanted to validate and learn from is, we had a couple of pieces in our product; one we called classification and the other one was transformers. Transformers, not GPT transformers, transformers in the sense of tokenization, encryption, anonymization, as a precursor to privacy synthetics itself. So we wanted to have this industrial-grade, bulletproof privacy. And what we wanted to see was whether users and developers wanted to use these tools independently, essentially have building blocks or LEGO blocks, or they wanted to have end-to-end workflows.

And what we found out, and this was contrary to our original thesis, was that developers like the notion of building blocks once they get very sophisticated and scaled, because they can call each one of those blocks through an API and build their own automation. But from step zero to seven, they just want very prescriptive, opinionated workflows.
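
[Editor's note: as a purely editorial illustration of that "building blocks vs. prescriptive workflow" distinction, the sketch below shows the same three hypothetical steps (classify, transform, synthesize) exposed both ways. The function names are invented for this example and are not Gretel's API.]

```python
from dataclasses import dataclass

# Building blocks: each step is callable on its own, so advanced users can
# script their own automation around any subset of them.

def classify(records):
    """Label which fields look sensitive (toy heuristic)."""
    return {field: "sensitive" if "email" in field else "ok"
            for field in records[0]}

def transform(records, labels):
    """Tokenize/redact fields flagged as sensitive."""
    return [{f: "<REDACTED>" if labels[f] == "sensitive" else v
             for f, v in r.items()} for r in records]

def synthesize(records):
    """Stand-in for model-based synthesis: here, just copy the shape."""
    return [dict(r) for r in records]

# Prescriptive workflow: one opinionated call that chains the blocks, which is
# what newer users tend to want from step zero to seven.

@dataclass
class WorkflowResult:
    labels: dict
    synthetic: list

def run_private_synthesis_workflow(records) -> WorkflowResult:
    labels = classify(records)
    safe = transform(records, labels)
    return WorkflowResult(labels=labels, synthetic=synthesize(safe))

if __name__ == "__main__":
    data = [{"email": "a@example.com", "age": 31},
            {"email": "b@example.com", "age": 44}]
    print(run_private_synthesis_workflow(data))
```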

So when we released this alpha in September of 2020, we had all those three big learnings. We took all of that, put it back into the product. Luckily, we were very transparent with Sridhar. He actually really loved that stuff. He ended up preempting our Series A, right? When we shared all these learnings with him, we thought he would be like, "What the f***? You told me this stuff works." And he's like, "Great learnings. Let's double down." So kudos to him in helping us. And that's really when we took a lot of that, started building it going forward. And it was early-to-mid 2021 that we actually ended up rolling out the GA of all these particular products. And that's roughly when I ended up joining as well.

Sandhya Hegde:

Do you remember whether there was a specific industry, like fintech? You would assume, given the value prop around privacy, that there are specific industries that are maybe early adopters, but I'm curious what you actually observed in terms of who were the first few people to be evangelists for Gretel.

Ali Golshan:

It's interesting you say that. Recently I've had some conversations with some investors, and one of the questions I always ask them is, "Where do you see synthetic?" They always reach out and they're like, "Oh, we would love to talk to you. We have this thesis." I'm like, "Tell us about your thesis." It's like, "Synthetic data's great." I'm like, "Cool, okay." But I always ask them about use cases, and they're like, "Regulated industries." And I feel like that's actually quite a lazy answer. The reason is that with regulated industries, like health, finance, government, there is an amount of inbound that you get from them, but the regulated industries need to meet a minimum privacy bar or safety bar for some access to the data. Now, whether that data is balanced, biased, underrepresented, improperly distributed, or enough data in itself, these are actually all unanswered questions.

So, regulation is a very, very low bar, because for that data to truly have valuable output, you need to do a lot more prep with it, whether you boost it or balance it. In some cases, what we even do when we talk about accuracy of data ... We consider ourselves in the data business, not the synthetic data business. Some of our customers who have more advanced use cases, the target they set with us is, "We want to improve our downstream prediction or inference over 100%." And what that means is their raw data produces 100% as part of their baseline, and what they do with us is they take a bunch of these underrepresented datasets or demographics, they create more synthetic versions of them, and augment their raw data with synthetic data, so they have a much more balanced, perfected dataset.
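
[Editor's note: for readers who want to picture the augmentation pattern Ali describes, here is a small, hypothetical sketch: count an underrepresented class, generate synthetic records for it via a placeholder generator, and concatenate them with the raw data before training. The generate_synthetic function is a stand-in, not a real Gretel call.]

```python
import pandas as pd

def generate_synthetic(example_rows: pd.DataFrame, n: int) -> pd.DataFrame:
    """Placeholder for a synthetic data model trained on example_rows.
    Here we just resample with a small perturbation to keep the sketch
    self-contained."""
    sampled = example_rows.sample(n, replace=True).reset_index(drop=True)
    sampled["amount"] = sampled["amount"] * 1.01  # pretend these are new records
    return sampled

raw = pd.DataFrame({
    "amount": [10, 12, 11, 200, 9, 13, 11, 10],
    "label":  ["ok", "ok", "ok", "fraud", "ok", "ok", "ok", "ok"],
})

counts = raw["label"].value_counts()
target = counts.max()  # balance every class up to the majority class size

augmented_parts = [raw]
for label, count in counts.items():
    if count < target:
        minority = raw[raw["label"] == label]
        augmented_parts.append(generate_synthetic(minority, target - count))

balanced = pd.concat(augmented_parts, ignore_index=True)
print(balanced["label"].value_counts())  # classes now evenly represented
```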

So these were some of the things that I think became, in our opinion, more horizontal use cases. We still do maybe 10, 15% of our total pipeline generation through outbound lead generation, because it makes sense. We tend to focus the overwhelming majority of it on what we call qualified inbound. As part of that, we do see regulated industries play a part in it, but as we're starting to see the first patterns of maturity, we're finding that just being a regulated industry is actually just purely the bottom of the totem pole, which is, "I just want to share my data. What I do with it needs a lot more sophistication and higher bars."

Sandhya Hegde:

What are the characteristics you look for to say, "Oh my God, that should be a good customer for us"?

Ali Golshan:

When we ended up betting on Kubernetes about two, three years into the company, the tailwind was very clear, just a massive lift right away. The reason I mention that is that something similar happened at Gretel, which is, once the lift of large language models, generative AI, all this stuff, started to hit, it just created an enormous amount of tailwind for us.

So really the inflection point was, "I need to use data, I need to access data. I have an enormous bottleneck or friction, and now that is around a bunch of things that naturally give you a second or third or fourth chance within a customer to be able to use you even if something doesn't work perfectly," and that helped accelerate a lot of things for us.

How Gretel anticipated and adapted to the emergence of LLMs 

Wei Lien Dang:

Ali, do you think there were any lessons learned in terms of ... At StackRox, even early on, a lot of people were like, "Eh, container security, it's only for financial services and for people who have heightened security requirements." People didn't really realize that Kubernetes was going to become as pervasive and widely adopted as it has been, right? I'm curious, in the context of AI, obviously it opens up so many use cases for a product like Gretel. How much did you anticipate that? I know you were working with models, but did you really anticipate the wave we've seen around ChatGPT and Stable Diffusion and AI more broadly, and how it's captured people's imaginations?

Ali Golshan:

Yeah. I'm glad your question about learnings from StackRox to here was about product, not generally on people, because I was about to unload. I think in the context of product, there were a few things we did anticipate, and then there were a few that just completely ended up being accidental.

Maybe talking about things that we actually anticipated, and if you go back to our seed deck, we always talked about this: our premise was always that the right privacy and safety can accelerate access to data, and our thesis was that with the emergence of AI, edge computing, all these pushes, people are going to need more data. You can't prevent them from getting or collecting data, so you need a way to basically reduce the economics of data acquisition. And we wanted to figure out a way to solve for that. Now, we didn't anticipate large language models specifically to be just 99% dominant in that big category. We thought it would be a bunch of things which would push to that, which is also why we started with the privacy engineering messaging around our tooling, because we weren't just focused on large language models.

But then the thing that ended up creating more and more tailwind, and where we won quite a bit on a single differentiation, was the accuracy and the utility of our output. People were like, "I want to start looking more and more into using synthetic data for machine learning or for AI." As a result, even a 5 to 10% accuracy difference is a world of difference for us.

I would argue the thing that for us became opportunistic, not foresight, was what we called the last-mile problem, which is now one of the biggest use cases that drives us, which is ... The thesis that we developed last year as we saw some of these language models evolving was, when you have language models, right now, the economics of them are essentially you are just powering and paying cloud providers to use GPUs to a certain degree, and then they're the ones who are making money. At some point, you have to take these foundation models and economically make them viable. They become platforms, they become verticalized use cases. And the way to do that is, you can't find that data in the public domain. So that fine-tuning, optimization. If you want to make ChatGPT your company GPT for chat, you have to train it on your data.

But how do you make sure you're not loading all your sensitive data, with the model memorizing all the things subject to the right to be forgotten? So all the privacy implications, all the way to a path of what if someday ... And this was actually within our vision of: how do we allow companies to monetize or commercialize their data? So if you're a massive financial institution and your data's valuable for you to train your models, but you also want to commercialize it safely for others to train on your data, that was a big problem we wanted to solve. So our view was that, okay, well, training on private domain data is actually the key to commercializing a lot of these foundation models that are being built and taking them that last mile.

So, I would say our thesis around the general data bottleneck held true. The more practical, applied version of it, what it looks like day to day, that LLM training, especially on private data and doing it safely, became more of an accelerant last year. And I think one of the very key things for us that was half-true was our assumption about our user and buyer. We started, as you all pointed out, with a developer-centric view, and we still view ourselves to be the developer-centric platform. And part of it is we wanted to build around cloud-native technologies. We didn't want to build professional services. So all of this meant we would be able to build a particular product, but for you to extend that into your platform, you needed at least a bare minimum programming capability so you can write a few scripts or a few calls and be able to integrate it into APIs and other types of platforms.

But what we've learned over the last year commercializing is, while the technical champion and decision maker is the developer, they may actually never be the buyer, and the buyer may never, ever touch our product. So of those 75,000 users that we have on our platform that are building models and teaching us about how to auto-config and auto-tune models, not a single one of them may be a buyer, and the buyer may never actually see that platform.

Originally, we had built a lot of our funnel, from top of the funnel customer acquisition all the way down, in this orientation that we have to find a developer. But about six-to-nine months ago, we found that actually, if we break this out and say the PQL is the product-qualified user who can be the technical champion, and the MQL or sales-qualified user is actually somebody who's a buyer, and we don't build one common denominator for both of these, we can actually be much more proficient.

So it was learnings like that that were contrary to our original thinking that we had to make changes. And a lot of it goes back to what you were saying, Sandhya, is it feels like every week there's a new pattern that you have to adjust to.

I think the key things are some of our original concepts around developers wanting to build with this type of technology, synthetic data with differential privacy, and language models have held true. One of the headaches I'm glad to have avoided at Gretel compared to StackRox is, we didn't have to make that product pivot. But in terms of messaging, user, and ICP positioning, these have all been completely new learnings. And this is something I wish I had remembered or known at my first companies: when we hired our first VP of sales at Gretel here, the person we hired, a guy named Jeremy, I remind him all the time, "You're not a VP of sales. You're a business officer. You're literally trying to figure out how to commercialize something in an industry that had never existed before."

So I think it's very important to bring those. And our assumption, again, was some things you could do in parallel. But what we realized is you have to build a product, then you have to build some messaging, then you have to try pricing, then you have to reiterate. It was a lot of serial processes, because in a nascent market, you figure out a base, because if you have multiple verticals dependent on each other and one turns out not to be true, everything is wiped. So it was a little bit more systematic, which was probably a slower progression into the market for us. But luckily, it has turned out to be a relatively fast uptake.

Sandhya Hegde:

I'm curious, you did say that, obviously, your original thesis has only gotten stronger with what has happened in terms of just how much interest there is in the entire business ecosystem to try to leverage the data they have and leverage LLMs in particular, but all kinds of foundation models. But I'm curious, there must still have been some big bets that you needed to place once you saw what was happening. So if you go back to maybe nine months or a year ago, what were some of the big unknowns for you as you saw the Stable Diffusion launch, the ChatGPT launch? What were the big questions that maybe you now have better answers for, but at the time were tough decisions to make?

Ali Golshan:

One thing that we ended up accelerating because of Stability AI's work and OpenAI's work was, from day one, and we have blogs about this, one of our major bets also in synthetics was that synthetic data has to mirror real practical use cases of data, so the platform for synthetic data has to be multimodal, not just a single modality. And the thesis, actually, Wei, was very similar to learnings we had in security. Build one platform if you can rather than vertical tools for every new functionality that comes up.

With that, we started with tabular data and time series data. And we had built an internal framework called Model Integration Framework. This is actually one of our core differentiators. It enables us to tune and build our own models substantially faster than a lot of other players. When DALL-E came out and when ChatGPT came out, we realized that we actually had to pull that forward and we had to invest a lot more in the different modalities that we have. So that was something that certainly happened as an accelerant. We weren't planning on releasing some of the modalities as fast as we did.

And some of them ran into early friction. Some of them we had to build in an unnatural way just to find out if the user even matched our core user. We tend to not think about modalities as all equal. We think about it like, based on our core users, based on our use cases, which ones have compounding value, and that's how we determine the directionality, not like, image is hot, let's go do image. So that was one that certainly changed around our roadmap a little bit, but that was just the normal pull we were seeing from customers.

The other part of it was the same framework we had built actually did something that we never anticipated would be valuable for anybody other than us, but now it's one of the biggest pulls, which is something very similar to what Mosaic does. We had built this framework we call Gretel GPT. And Gretel GPT is actually not just a generative pre-trained transformer. It's actually a framework that you can plug any open source GPT into and it will bolt on a bunch of our own unique capabilities. Differentially private training, tuning, gradient clipping, validation modeling, reporting, text measurements for accuracy. All these enterprise-grade tools that we wanted for our own GPT.
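
[Editor's note: to illustrate the general shape of a framework like the one Ali describes (plug in an open source model, bolt shared privacy and validation tooling around it), here is a purely hypothetical wrapper sketch. None of these class or method names come from Gretel's product; it only shows the "any model in, shared guardrails around it" pattern.]

```python
from typing import Callable, Protocol

class GenerativeModel(Protocol):
    """Anything that can be fine-tuned on text and then generate text."""
    def fine_tune(self, corpus: list[str]) -> None: ...
    def generate(self, prompt: str, n: int) -> list[str]: ...

class PrivateFineTuner:
    """Hypothetical wrapper: takes any generative model and adds the
    surrounding guardrails (redaction before training, validation after)."""

    def __init__(self, model: GenerativeModel, redact: Callable[[str], str]):
        self.model = model
        self.redact = redact

    def fine_tune(self, corpus: list[str]) -> None:
        # Scrub obvious identifiers before the model ever sees the text.
        self.model.fine_tune([self.redact(doc) for doc in corpus])

    def generate_report(self, prompt: str, n: int) -> dict:
        samples = self.model.generate(prompt, n)
        # Toy "validation" step: flag any output that still contains an email-like token.
        leaks = [s for s in samples if "@" in s]
        return {"samples": samples, "possible_leaks": len(leaks)}

# Usage with a trivial stand-in model:
class EchoModel:
    def __init__(self):
        self.corpus: list[str] = []
    def fine_tune(self, corpus):
        self.corpus = corpus
    def generate(self, prompt, n):
        return self.corpus[:n]

tuner = PrivateFineTuner(EchoModel(), redact=lambda s: s.replace("@", "[at]"))
tuner.fine_tune(["contact me at a@example.com", "totally benign sentence"])
print(tuner.generate_report("write something", n=2))
```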

To your point, nine months ago, maybe most LLM bake-offs were won by OpenAI. Now there's a dozen different options, and that number's reducing. So as that trend was starting to build, we were just getting the same asks: Can we just plug another model into your GPT, use all this framework, run it in our own environment, apply all these privacy tools, so you can not only be a model but also a safe framework for fine-tuning, training, or experimenting with LLMs?

So we're seeing a lot of traction on that. If you look at our customer deck, it's one of the big use cases that we never anticipated would be one. But again, the credit here goes to my co-founders, Alex and John. They had the foresight to build this for us, and now it's becoming a very valuable framework.

So yeah, I think there's a lot of things we've opportunistically jumped on and have had this tailwind to go with. But I think any founder will tell you in an early market you'll take any sort of unfair advantage you can get to land yourself in a customer.

How Gretel’s founders make decisions about their product roadmap

Wei Lien Dang:

Ali, I'm just curious, from a practical perspective a lot of what you're describing requires really quick adaptation to learnings that you're gaining, what's happening in the market. With AI, things are moving so quickly from week to week. How do you, Alex, and John operate in a way that allows you to really change things quickly and decisively? You know what I mean? We often think the most successful founders, one of the things that's common to them is the slope or rate of change they're able to drive. I would say at StackRox it was the opposite. We had our assumptions, and we were not adaptive enough, early on at least, and then we had to pull it back from the brink. But at Gretel, how did the three of you work together to move quickly in a sensible and rational way?

Ali Golshan:

I actually think the early years of StackRox really shaped my thinking about the early days of Gretel. At the same time, if you remember, Alex came four years out of AWS. He was a GM there. He launched three of their fastest-growing services. That all brought us to this fact that we always knew that much more sophisticated, large-scale advanced customers, especially because of privacy, will never want to use Gretel Cloud and use us as a SaaS service. But what we always figured is if we leave X amount of credits every month in perpetuity, we will build up a user base there.

So we actually use our cloud not only to enable individual users to build, develop, build it into their reference frameworks, and validate all the claims we're making very openly and transparently, but the biggest value we get from it is we actually introduce all of our new features in Gretel Cloud first, see how they work, expose them to those 75,000 users, get iterations, workflows, what breaks, all these sorts of things, and then we take that and we build it into our hybrid model that deploys in customers' own environments in an owned cloud and can scale.

So one area for us has really been to use that as the tip of the spear for experimentation and loading everything much faster. And while we do have some enterprise customers, it is really a little bit more of an early preview landscape for us. But it gives us much faster velocity, much faster iteration, experimentation. We even take some subset of our most advanced users, give them early previews of products. So that has been very helpful.

The other one that I can say 100% has been shaped by my experiences is, as you said, Wei, there's sometimes no good option. It's just two bad options, because you really don't know, and you just have to make some tough choices. And I think the three of us are just very comfortable making difficult choices. It partially comes down to the team. This is really a topic that I don't think is talked about enough, which is that I think the board dynamics with founders have a lot to do with how quickly you make decisions, because I think founders a lot of times feel compelled, like, "I have to double down, we have to make this work!"

Wei, you may remember this. I remember we had a board member who like nine months after we had launched was like, "You got to give this thing time to work," and you and I were like, "We know this doesn't work. We got to change." And I think here a lot of the board top-down was like, "We're supportive, we bet on you guys because we think you know what you're doing." We had this massive amount of input data. And then we got very particular about the type of people we hired, like some of the folks that came from StackRox, like people who just wanted to innovate. And it was a combination of making tough decisions and having an environment where you can get very fast, iterative feedback, so not everybody's debating philosophically, is this right, is that wrong? We can throw something out and in two weeks a few thousand people can use it.

And then the third one is, I think myself, John, and Alex as founders have really good complementary skills, similarly to how you and I operated at StackRox. So I think it's just a bunch of different things that have given us this velocity. We had an all-hands today, board meeting last week. I do say that our product velocity and releases are something I'm very proud of. The team has done an incredible job creating high-velocity releases there.

Sandhya Hegde:

I'm taking away a very nuanced lesson here as a board member, which is don't make founders sell products they don't like.

Ali Golshan:

As long as the founders are somewhat selling to the people who are kind of like them. But if you have a founder selling to someone that is not at all like them, and they say they don't like this, then you're probably right to put your foot down. But yeah, I think how the board interacts with founders has a lot to do with the speed, because I think ... you call something a pivot, and it has such a negative connotation. But we don't call it a pivot, but we make hundreds of minor adjustments from month to month. We constantly tell the whole company, "This is the North Star, we're going to zigzag our way there. And as long as we're just generally trending the right way, that's the most important part."

So we try to shorten those lifecycles as much as possible, and I think it really helps that especially John and Alex, coming from that disciplined background of the AWSes of the world, very quick iterations, customer voice ... I'll give you one example. One of the things we ended up doing, to make the voice of the customer even more pronounced in the R&D org, based on a lot of the learnings, which I'm sure is an entire deep dive in the AI space in itself, was we actually ended up eliminating all functions of SC and CX and sales. We literally removed it all, and we built a team called R&D-facing Customers in R&D, which is actually a couple of extra folks we hired with applied science and engineering backgrounds that report to engineering managers and applied scientists, but they're purely working with our customers for fine-tuning models, optimization, large-scale deployment, integrations into cloud-native frameworks. And once we realized that they're talking to their peers as PMs, they're deeply technical, they're not translating something with missing words, and that very quickly makes it back into the product and we can push it through things.

It's all those little optimizations that I think have helped. Essentially, again, having top-down support, but being tough on decisions. We've made some decisions that it's like ... I'm happy to use an example with a customer if you want, where we're like, we're going to basically shut this off and let the product speak for itself if it's going to work.

Gretel’s approach to driving adoption within traditional enterprise businesses

Sandhya Hegde:

You brought up enterprise, and one of the things we're seeing is that right now, at least when it comes to traditional enterprise businesses, maybe not large software companies, there's a lot of desire to figure out, "How do we leverage this, let's not get left behind," but there's very little in-house talent in terms of actually figuring out even how to deploy something like Gretel, let alone recreate it in-house. What are you seeing as the biggest enablers? So you have an old-school enterprise company that maybe has a lot of data, but has never actually deployed ML in production. What do you see as the stepping stones and the drivers of true enterprise adoption here?

Ali Golshan:

I think to talk about what's the biggest enabler, we should talk about what is our biggest friction point. And our biggest friction point is really operationalizing in an automated way. And part of this, again, back to the StackRox learnings, is that in a very technical space that is very nascent, operationalizing these tools is very difficult. Something else we learned roughly about a year, year and a half ago is that operationalizing Gretel was not really about deploying Gretel. Gretel's a few containers or a bunch of APIs or you run it in the cloud.

Where there is the most friction is the data that has to come to us, and where we have to send that synthetic data out to. So we actually ended up allocating a good portion of our product roadmap every release to what we call downstream-upstream connectors, like one-button connect to S3, BigQuery, getting that data into us. One-button deployment on top of Kubernetes so you can deploy. Making sure all of our UI functionality is replicated in the API, so if somebody's running in Vertex or SageMaker or Azure ML, they don't have to run the Gretel UI. They can transact through the marketplace and plug it in and just run normally.
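
[Editor's note: as a concrete, hypothetical example of the connector idea, the sketch below pulls a CSV from S3, hands it to a placeholder synthesis step, and writes the output back, the kind of upstream/downstream glue Ali is describing. The bucket, keys, and the synthesize function are invented for illustration; the boto3 calls are standard.]

```python
import boto3
import pandas as pd

def synthesize(df: pd.DataFrame) -> pd.DataFrame:
    """Placeholder for the actual synthesis model; shuffles rows so the
    sketch stays self-contained."""
    return df.sample(frac=1.0).reset_index(drop=True)

def run_s3_to_s3(bucket: str, input_key: str, output_key: str) -> None:
    s3 = boto3.client("s3")

    # Upstream connector: pull the raw table into the job.
    s3.download_file(bucket, input_key, "/tmp/input.csv")
    raw = pd.read_csv("/tmp/input.csv")

    # The core product step.
    synthetic = synthesize(raw)

    # Downstream connector: push the synthetic table back next to the source.
    synthetic.to_csv("/tmp/output.csv", index=False)
    s3.upload_file("/tmp/output.csv", bucket, output_key)

if __name__ == "__main__":
    # Hypothetical bucket and keys, for illustration only.
    run_s3_to_s3("example-bucket", "raw/customers.csv", "synthetic/customers.csv")
```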

So I would say the biggest enabler was recognizing and admitting that sometimes it's not your product, but you just have to do things that are in your ecosystem to help your own product. And we ended up spending a lot of investment and cycles in doing things that helped our product operationalize. And once we took one step further and operationalized that, we would go further and further. But we would make those very modular, because what we don't want is to say, "Here's an opinionated end-to-end workflow, and here's how you have to run it."

So it's sort of that decision of: the core product is very much the product roadmap; all these integrations are either through ecosystem partners, source available, or even open source connectors, things like that that just help. But I would say they're two sides of the same coin.

Sandhya Hegde:

I'm curious, the questions you get from more traditional customers, do you find yourself as a company educating them a lot on the ecosystem? And where is the lift heaviest in terms of education that is maybe adjacent to what you actually do as a company?

Ali Golshan:

I'm laughing because there was an early incident where we were talking to this customer, and founders, as we all know, we're only human, and I just lost my cool. I was like, "Okay, well, we're clearly not for you, because you don't get this. Let's all move on and be happier about this." This was a couple years back. Maybe I shouldn't be saying that, but it's true. We were like, "It's just not right here."

I think some of the more traditional questions we get are not necessarily because a company is traditional, but it's because we're talking to an org that is more traditional-thinking in their approach to something. This could be a very large financial company, audit company, or a tech company, but if we come across, for some reason, their security team or privacy team or compliance team, and they're used to, you know, I run my tools in air-gapped environments, this never leaves, I run SQL in my database on-prem, and this is how you access it, I need a reverse proxy, it's more that type of thinking that for us causes the most amount of friction, which is one of the reasons why we've oriented our ICP around a developer and a cloud-native stack, because that's actually a qualify-out for us. Oh, you want us to plug into Oracle 8 in your private data center? Well, you're probably best using one of the other synthetic data companies that do that.

So I would say it's more traditional workflows. But we've intentionally stayed away from those, because traditional workflows also don't have that large overlap of consistent and normalized integration points of APIs or RPCs and things like that, which is, I think, what gets you into the trouble of we need professional services, managed services, forward-deployed scientists. We wanted to always avoid that. So for those traditional workflows, we very willingly concede those and walk away from those, because we're trying to build a little bit more of an efficient asymmetric motion for go-to-market.

Wei Lien Dang:

Ali, we always think about the urgency that a prospect has. I think what you're describing is, if there's not alignment with the problem that they're actually looking to solve, you can't convince them. It just doesn't necessarily make sense. Sometimes maybe eventually they'll get there, but I think if they don't appreciate what your product's able to do here and now, then you're not in a good position to make them successful with it.

Ali Golshan:

You're actually hitting on something very important, though, Wei, which is the way we sell is actually very different than like how OpenAI sells. I do think there's these waves to the whole generative AI space, which is I do think most companies are thinking about it as, "What is our LLM strategy? How are we going to use it? Which ones we're going to use? How are we going to tune them? Do we fine-tune, optimize, prompt-tune? Do we do it in our own environment? Do we do it in the cloud?" There's all these questions that they're having to answer, which in a way can create artificial pressure points for your sales team to fear-sell them. Like, this company, your competitor, is betting on these two LLMs. They're going to offer X amount of less services or more sort of things like that.

For us and synthetic data, there are some companies that recognize the privacy implication, but for us it's still very much a value sell. What can we do with it? What is the outcome accuracy? What are the things that you can do with synthetic data that we couldn't do? Let us run it for two months and measure the difference in data access time, developer by developer, versus the original.

So some of the things that we still struggle with but are working through is how do we get those sales cycles for more large, sophisticated, high-potential customers from that six-to-nine month cycle down to three months so we can get much better forecasting? We make forecasts, but they're sort of educated guesses as to which quarter they fall into right now. We always tout that we don't have closed-lost or closed-no-decision deals, but that doesn't help when you have a hard time forecasting.

So I think for synthetics, we're still at the part of the curve where it's value sell, promissory sell. Companies that are very advanced, like some of our customers like Illumina, Snowflake, Google, they're forward-leaning. They have advanced use cases. They're like, "We know exactly what to do with this." But for I would say 75, 80% of our customers, it's like, "We believe in the thesis of synthetic data. Show us how it works in practice." And I think we need another six-to-nine months before we start to hit that tip, which is like, "We know we need it. We have a standard four-week POC cycle. We're going to validate X things."

And partially to accelerate that, we're building all these frameworks. Here's how we measure, here's how we actually tune, all these sort of things transparently. But some of it, unfortunately, just is market maturity that needs to happen and some of the cycles before us to get fully fleshed out.

Wei Lien Dang:

Part of what you're describing is just, with any such huge platform shifts, it becomes part of organizational strategy and what do we do about it, and you have a bunch of folks who are trying to think through, "Well, this is our approach," and it takes time for that to settle a bit.

Sandhya Hegde:

Yeah. But you can hopefully pursue the Holy Grail of being able to write your own RFP. You made the category.

Ali Golshan:

We've gotten exceedingly proficient at that.

Ali Golshan’s advice to early-stage founders building in AI

Sandhya Hegde:

Oh, well, Ali, what would be your advice to early-stage founders, especially first-time founders, getting started in AI in 2023?

Ali Golshan:

In all honesty, what I've found makes life a lot easier is working with very, very proficient product people, and Wei worked out really great as a co-founder for me at StackRox, so I just tried to replicate that. But I think the main advice is around two things.

One, in the general generative AI space, is figuring out what you're releasing, where it fits into this cycle, right? And I think if you're starting with an LLM today, and even if you have a highly specialized orientation, like we're going to train it on healthcare data and healthcare tests and exams and all these sorts of things, I think it's very important to realize where in the overall cycle you're going to hit. Is that going to be more valuable, or is it more likely that companies like P&G and Johnson & Johnson are going to use Gretel and Mosaic and, well, Databricks now, and just fine-tune some existing LLM, and by the time you get yours out there's five other ones and you're basically pushed into this commoditization? And we, by the way, fully believe models will be fully commoditized. This is where we think data will be differentiated. This is our view that synthetic data can actually let your differentiated or unique data drive value.

So figuring out that curve is very important, not only because of consolidation and value, but also, what is your motion for selling going to be? If you release at the peak of hype, people are pulling you. If you're doing it as building is happening, you're basically having to demonstrate value. And if you're on that falling side of the cliff, you're basically in this race to the bottom for margins, which is not very valuable, and it's consolidation that is happening, which I really, truly believe is where labeling is headed. I think every company that is doing labeling will have to pivot into something else, because labeling is getting commoditized.

The other advice I would generally give to founders is I think most founders have this romanticized view of what it's like to start a company. Yes, there will be friction with the board, there will be ups and downs with product, and I think everybody knows this. But the one thing I always remind ... To the dislike of our ops team and talent team, I spend like five minutes of every interview trying to convince someone not to join us.

And part of it is I think the biggest thing that founders miss is ... It's the real extreme highs and lows of the business after the first couple of years. The first couple of years as a technical founder are amazing. Your head's down, you're building. Everything's greenfield and promissory, and here's what we're going to look like as an IPO. And then there's that day that you open it up, you try to commercialize, sell it, price it, support it, take product feedback, work cross-functionally, and like Monday you're like, "We're going to fail," Tuesday you're like, "Oh, we won this big account, we're going to be a public company," Wednesday your big company goes away. Like, "Oh, shit."

So what I have found is, it's these extreme highs and lows and their frequency, and that just takes this massive toll on you. It's not work-life balance, it's just this ... Like, you wake up at 3:00 in the morning, and that anxiety is still there. As much as I hate to quote Zuckerberg, he was talking about it. It was like, being a founder is like waking up every morning getting punched in the gut. It kind of feels like that, because nobody reaches out to you with the good stuff. That is all stuff they share face-to-face. At like 7:00 in the morning, Slack, all the stuff that's on fire is first-to-read. So I think those are the two pieces of wisdom I would leave with founders.

Sandhya Hegde:

Right, right. Only start a company for something you're willing to go through that for, or because you just cannot imagine working for someone anywhere. But-

Ali Golshan:

Yes. I mean, there's all the incentives, right? Or you believe this thing is economically so valuable that you're willing to put yourself through whatever it takes because you just want to make that money. But whatever it is, that motivation has to match that extreme scenario you deal with.

Sandhya Hegde:

Thank you so much for coming on The Field Guide, Ali. I'm really excited about your new product launches. Thank you so much for sharing the story. It's going to be so useful to our listeners.
