FOSS

Episode 70: Early-Stage Venture Capital, OpenOcean, with Patrik Backman

Intro

Mike: Hello and welcome to Open Source Underdogs! I’m your host Mike Schwartz, founder of Gluu, and this is episode 70 with Patrick Backman, General Partner, CEO and Co-Founder at OpenOcean, and Co-Founder of MariaDB.

This episode uncharacteristically diverges from the normal format. It’s more of a general interview than a discussion on the business model of a specific company. We recorded it a few weeks back on March 1st. The release was delayed because I wanted to prioritize Kevin Muller’s pitch for attending the Open Source Founder Summit in Paris this coming May.

And if you’re on the fence about the Founder Summit, there’s still time. Last night, I had some barbecue with Frank Karlitschek, founder of NextCloud, and he told me that NextCloud did 80% growth on a revenue base in excess of $10M. So, if you want to find out how we did that, and you have the resources, go hang out with him and lots of other talented founders in Paris at the Open Source Founder Summit.

And, as this is a bit of an unconventional episode, maybe now is a good time to set some expectations for The Underdogs podcast for the next couple of years. My plan is to complete around six episodes per year for the next five years and finish at around 100 episodes.

The 100th episode will be Gluu, and you’ll find out how I was able to put all these years of feedback to work in my own startup.

I’ll be recording in person at FOSDEM, SO-CON and All Things Open. If you feel like your startup should be on the Underdogs podcast, or you have a suggestion for a startup that I should reach out to, send me a message on LinkedIn. I’m always on the lookout.

The other reason I’m slowing down is that Emily Omier’s podcast The Business of Open Source is doing a great job, documenting firsthand accounts from business leaders in the open-source world.

If Emily’s podcast existed in 2018, I might not have even started Underdogs. So, if you haven’t listened to her podcast, you should add it to your queue. And thanks Emily for all the hard work, a lot of logistics preparation and post-production work goes into each episode, so 200 episodes doesn’t just happen without a lot of commitment.

With that lengthy intro behind us, let’s finally get back to the interview with Patrik Backman. He’s a bit of walking internet history. And it was a lot of fun to get an opportunity to chat with him, so here we go. Patrik, thank you so much for joining us today.

Career Overview

Patrik: Yeah, it’s a pleasure, thank you.

Mike: Maybe just for the audience you could tell us a little bit about your background, and how you ended up entering the database business.

Patrik: I started actually out of school, out of university, I got recruited into MySQL, or MySequeL, as they say in the US, in 2003, and then I worked six years as part of MySQL database company. From couple of million in revenue, or almost 100, from 50 people to about 600, and then, in the last couple of years, I was part of the management team, I run a large part of product development, and also some business development.

And then, following the MySQL time, we exited to Sun Microsystems in 2008 for $1 billion – that was a great achievement, at least in the European, so to say, start of ecosystem. Following that, we started exploring how to build a venture-capital business. And parallel with that, we co-founded a company called MariaDB, which is a successor company to MySQL. But since 2011, I’ve been operating as a general partner, together with two other partners with OpenOcean, which does kind of invest in the early-stage data software in Europe.

Years at Sun

Mike: I have a Sun Microsystems’ coffee cup. I’m just curious, as somebody who had actually worked at Sun for a while, if you could talk a little bit about what was Sun like?

Patrik: It was quite a remarkable journey. After being acquired by Sun, I was part of the Sun for almost a year, and also part of the integration kind of tasks force to integrate the MySQL company into Sun Microsystems. But I think there was extraordinary talent around engineering overall in the business – it was of course a huge company – 35,000 people – but extraordinary talent, especially a software team which I met a lot, however, the commercial side was quite awkward to me. At the time, much of the philosophy seemed to be — what I noticed was that software was open-source, and for free. And then, the point was to sell hardware.

So, me coming in, with MySQL and still thinking that, ‘hey, it’s an interesting scaling business, there’s a lot of good ways to make business around open source’, Sun didn’t really have that mentality. And we were quite a lot discussing their how could kind of the commercial side be developed around the open-source products that Sun had and many, many great ones.

But then, I think the business side of that was a bit lacking, and they never really got to build that out – the company almost went into insolvency, or big difficulties, and then Oracle acquired it at a quite low price.

Inevitability of Oracle Owning MySQL

Mike: Do you think it’s somewhat ironic that Oracle, the prototypical commercial database company, ends up owning MySQL in the end?

Patrik: Well, what I’ve heard, and this is just ,of course, a rumor – so, “maybe” what people suspect – but I heard rumors that Larry Ellison always kind of wanted MySQL, to have control of it, because it was so popular and it had such a huge impact in the database industry, in terms of popularity, and in terms of power in the web and internet overall.

So, I think there was even some rumors that Oracle even earlier tried to get their hands on it, so to say, one way or the other, especially Sun Microsystems acquisition. Supposedly, it was much tied to the fact that that way they got control over MySQL.

More restrictive licenses resulting from Foss strip mining

Mike: I want to Pivot to MariaDB. I spoke with Michael Howard in 2018, and he was promoting BSL, or Business Source License, which HashiCorp later adopted also after a billion downloads of Terraform. Do you think that companies moving to these new restrictive licenses is a signal that the industry is standing up for itself, and that it’s sort of growing tired of the exploitive use of its software?

Patrik: Yes, absolutely. I think it’s a clear signal. At MySQL, 15 years ago, there was this — maybe today it seems naive thinking that ‘hey, if AWS, for instance, and Google, and Facebook use a lot of MySQL, somehow it will drive a commercial relationship down the line’, which is kind of beneficial to both parties and grateful for everyone.

However, it never really got there. It turned out that nowadays AWS has millions of these installations for their customers, and making a lot of money on them, but still paying really nothing for it. So, it has been clear that big players out there, they really want to use open source as far as they can for free, but themselves making a lot of money on it.

As this has become very evident that money will easily come to this popular open-source platform, then the restrictive license is what you need to go for, so to say. You need to be smart about your licenses so that it drives things to your business needs. Then, a question comes up, of course, for founders out there: what is the business need and the business ambition that the founder has, and different licenses serve different business models.

What is the best open-source license?

Mike:  So, let’s say I invent a new database called PigeonDB, and I want to license it somehow. Well, MariaDB used BSL licensing. Did that work? Like, what license should I use?

Patrik: As a founder, if you want to build a small kind of services business, you can use a free license as you want, very permissive. In that way, it gets maximum popularity, and spread, and awareness and adoption. Then, you can sell some support, or some fixes, or some advisory, or consultancy to users out there. And that’s a good business. You can even hire a few people, maybe tens of people, making even a few millions a year, and that’s a nice lifestyle business. That is a perfectly safe and good choice to make.

However, if you want to do more of a product-oriented business and earn licenses, or earn kind of product revenue for your product, then you need to have some restrictions on that distribution that you have. And I think that MongoDB is a perfect example of that. In popularity, I think MySQL and even MariaDB might be even more popular than Mongo. However, Mongo is making much more money than MySQL or MariaDB for that matter. So, having a restrictive license that prevents, so to say, kind of AGPL preventing the big player from just kind of hosting their own service database services and taking all the money for it.

Do we need a new open-source movement?

Mike:  Bruce Perens at State of Open mentioned the idea that maybe we need a new category of licenses. Do you have any idea if we did, what that would look like?

Patrik: I don’t know directly. I think there’s a good variety of licenses, and of course, one could always write their own, so to say, business source license is a new one created at MySQL and used at MariaDB. What open-source movement kind of have tried to do, or have done for 30 years, is collaborative work around software – it’s mostly the licenses have focused on one downloadable kind of software package that you take and install, and use kind of standalone, or in a combination with other software.

Now, most companies around open source, and also elsewhere, are making money by providing a service online. So, it’s a software service online, a cloud service. And somehow, it’s interesting that, in my mind, there’s a lot of enterprises combining open-source software, using it in their cloud setting for their own needs, but this glue – how they put things together, how they manage it, and so forth – this glue is typically proprietary.

Either they make it themselves, buy some proprietary by someone else, or buy this glue, so to say, from others, and somehow, an openness on this kind of glue between software and how it’s used in a cloud setting, maybe there could be some open-source movement on that. But that probably requires a whole new way of thinking. This is just an early thought, but something I’ve been pondering about.

Is Open Source a Tragedy of the Commons?

Mike: We have a professor here at UT Austin, who’s compared open-source software to a tragedy of the commons. Typically, in a tragedy of the commons, there’s a “freeloader problem”, where the freeloader is causing the degradation of the resource. I kind of argue that, well, what if the resource here isn’t just the software – because there’s no incremental cost to copying the software – but what if the incremental cost is actually maintenance of the project. Because for infrastructure projects like databases, what people really want is a stream of updates to the dependencies.

So, I’m wondering if you could sort of comment on the idea of like, is this “tragedy of the commons” analogy useful, and how do we avoid freeloaders?

Patrik: I mean, it’s built into the model that is open source. If you provide it with a permissive license, then people can use it for free and take benefit, and even maybe in a business way, in a wrong way so to say, take benefit of it, and so forth. But I think what should be recognized, and understood, and respected is that, indeed, anyone building software, even providing a free and open-source software, they should be allowed to have a credible and sustainable business model in order for them to be able to maintain and further develop it. But, then, it’s about what kind of balance you want to strike with what is free and what is commercial, and that’s where the license comes in.

Can the Government Help Fund Open Source Infrastructure?

Mike: There’s over four million open-source projects now. If you add up all the packages, and the repos, and especially the MPM repositories – excluding GitHub, who knows what it is if you include GitHub – there’s so much diversity in open-source projects. It seems like what you’re really saying is not that open source in general should have a business model, but certain types of infrastructure.

The government, let’s say, was going to fund open-source infrastructure to sort of address this freeloader problem – how would they know which ones to invest in?

Patrik: I don’t really know if that can be steered by government, or by any kind of regulating body, or any smart individual in that sense. I believe in the free market that there’s a lot of people can develop, and release, and provide software under different licenses and commercial models, and that it can be free or commercial, or closed-source or open-source. But in the end, it’s up to anyone in need of a certain solution then to decide what they take and on what premises.

I mean, someone might be okay to take something free and be ready to maintain it themselves, and not care about it if there’s a commercial side to it, or a company behind it, or even a developer behind it. Or vice-versa: someone wants to pay something for strong assurance that it’s managed by a company, and resources, and everything.

Beware VC Advice

Mike: I always caution founders that the advice you will get from venture capitalists is the advice of how do you make this into a really high-growth business.

Patrik: Yes. I could also give advice on how to just get maximum spread in different ways. However, what we, of course, think about as venture capitalists is that we try to find those businesses and help those businesses, or founders, that really want to build something significant in terms of a business and not just in terms of following.

Community Lead Growth?

Mike: What I’ve noticed is a trend towards something called community-led growth, where the company raises a bunch of venture capital. And then, they start an open-source project, and try and find a really devoted group of super fans who love what they’re doing, and then they look for a monetization strategy out of that. Have you been seeing that model more, and do you have any thoughts about is that going to lead to high- growth companies, or is it just a risky investment?

Patrik: Well, it is risky, it can work, but it’s often hard. I think specifically what is hard is that those founders, who are able and skillful enough to create a super-quality software that spreads and becomes open-source kind of very, very popular, the way to make money is often: yes, you can do an open-core model and have some commercial features, so to say.

And there’s a few examples of who have succeeded. I mean, Gitlab, and so forth, have succeeded in building kind of more than 100 million in revenue, but no one has really done with open-core kind of billions in revenue. And today, the way to really make money is by providing a cloud service – that’s what Mongo is doing very successfully, and many others.

And to build the challenge for those founders who create a very popular software distribution, kind of open-source software distribution, the mentality, and mindset, and skill set for building a cloud service is very, very different from building a great piece of software that you provide for downloading open source. So, often, the challenge is, and the risk is, how do you bring in that skill if you want to build a very large business.

Should Governments Use Open Source?

Mike: Patrik, you’re in Helsinki in Europe, and there’s a lot of innovation towards let’s say driving digital public infrastructure in the government. Should the government use open-source software, and how should the government consume and procure open-source software?

Patrik: That’s a good question. I think that open source really matters in infrastructure – to go back to the Oracle example, I think what enterprises, but also governments, should think about when it comes to infrastructure, you don’t really want to rely on something, like, a core piece of your infrastructure, where the provider can come every year and ask for 20% more in pricing, and you have no way to circumvent that.

So, a critical piece of infrastructure is, you shouldn’t have a lock-in closed-source component and the company that can squeeze you behind it. So, for infrastructure, it really matters, and then, how they should procure it – there’s always this software kind of service providers consultancy houses that still are needed to kind of sell it, in this way or another.

Yes, maybe the pieces can be free, but someone needs to help glue it all together, and help serve it, and help support it, and so forth. I think the biggest problem really is not in the nitty-gritty details of how government should buy these things, but actually in having a smart, technical person who is skilled in this kind of software, who knows what they’re doing.

Most things that could go wrong in my mind, when government buys software, be it open-source or closed, is that there’s incredible stupidity in the world – they try to, without really understanding the software, they try to do a long list of requirements. And then, the big software house, with maybe 20-year-old software, is the only one who has the resources to really answer all those zillion questions and come up with a packaging that serves all those needs. And then, it costs a billion to implement. It’s just crazy that these things happen with public money.

How to get the word out that open source isn’t free

Mike: I bet you’re talking about Western governments too, having challenges in operation of IT. How do we help all these other governments who want to implement, and we’re telling them to use open-source, but how do we communicate to them that we also need updates of the software, we need maintenance, and we need innovation in order for them to really make these ecosystems work?

Patrik: I must say I don’t know how to inform them, but this notion of “things should just be free”, yes, it certainly exists out there, and most companies want everything to be free, and the government. However, inside IT people understand that nothing is free – you need maintenance, you need support, you need help, you need people around any software to operate, and run it, and of course, improve it, but how to inform and educate people, it’s a good question that I don’t know how to answer.

Is now a good time to start a software company?

Mike: Last question, is now a good time to launch a software startup?

Patrik: Well, absolutely. It’s just tremendous what opportunities are out there. I think we have only scratched the surface of what data software can do, for instance, in all enterprises, in all businesses, in all functions, and probably in society, there’s processes and functions that can be digitalized, automated, optimized and improved. We’ve only seen the beginning of that. And of course, with hyper on AI, it’s a much broader thing than that.

It’s like everywhere in business, in functioning governments, in society, you can start tracking data, gathering data cheaper than ever before, put it together, build algorithms around it very easily to get intelligence out of it, get predictions out of it, get things optimized and automated – I think it’s very fascinating how the world can be improved in so many ways, using software in the future, and in the near-term actually. I think it’s a great time right now to think about wild ambitious ideas and work to implement them.

Close

Mike:  Patrik, thank you so much for joining and sharing all that experience with us.

Patrik: Thank you so much. It was really a pleasure.

Mike: And thanks to Alexander Izza from Resonance Public Relations. Cool graphics from Kamal Bhattacharjee. Music from Brooke For Free, Chris Zabriskie and Lee Rosevere. Next episodes will be recorded at the All Things Open Conference in downtown Raleigh, October 27th – 29th, 2024. This is one of the best open-source conferences in the US, so please join us if you can.

Until next time, thanks for joining.

Episode 68: Solomon Hykes, Co-Founder / CEO Dagger

Intro

Mike: Hello and welcome to Open Source Underdogs! I’m your host Mike Schwartz, Founder of Gluu, and this is episode 68 with Solomon Hykes, Co-Founder and CEO of Dagger, but also formerly the co-founder and CTO of Docker.

Back in February, I recorded this episode at the Civo Navigate conference in my hometown of Austin, Texas. If you want to hear the latest and greatest cloud native stuff and meet companies like Dagger, you should check out Civo Navigate. It looks like they are planning a US and Europe event each year.

This episode is a little on the long side because we cover some questions, relating both to Dagger and Docker. But if 45 minutes aren’t enough for you, there are a bunch of other interviews with Solomon, so just check the interweb.

And with that said, let’s just cut to the interview, so we can get to the main attraction. Here we go.

Y-Combinator Playbook?

Solomon, thank you so much for joining us on Open Source Underdogs.

Solomon: Thanks for having me.

Mike: I’m just going to dive into this with Dagger. Your approach seems very YC to me – who are the developers in pain that build Dagger?

Solomon: Yeah. Dagger came out of a process of talking to a lot of software teams about their pain, and the pattern we saw emerge was the problem of the deployment pipeline. You know, CICD, everything that happens after your code’s ready. But before your application is live, everything in between is just, painful, painful and complicated. And so, we’re focusing on making it a little less painful, a little less complicated. So, it is very YC, it’s the standard Y-Combinator playbook.

Community Lead Growth

Mike: What is the Daggerverse and Daggernauts, and what do you mean by community-led growth?

Solomon: Daggernauts are people who think of themselves as part of the Dagger community. Community is — I mean, it’s an overused word, but it’s really important to us, community-led growth is our business strategy. And the idea is, if you build a product for developers, and those developers are excited enough about it that they will not only use it, but show up on an online chat server to talk about it, come to meetups to talk about it, write blog posts about it, tell their friends, help each other – they become more than users.

You need a new word for that, so we use the word community, and then we market the product together. Sometimes we sell it together, and we write software for it together, so that that can become a way to grow as a business. It’s hard to do correctly, because you have to be authentic, you can’t fake it. Because a community will only form if there’s actually something in it for them, and it can’t be just transactional – they have to feel valued, respected, but when it does work clicks, then it’s very powerful.

Monetization?


Mike: As I understand it, you have an open-source project under the Dagger GitHub – your trademark – and your strategy is to monetize by selling a maybe cloud-controlled plane or dashboard, where enterprises can really see value. Am I on the right track?

Solomon: Yep, totally. Open-source engine, optional proprietary control plane.

Mike: What is the business value that enterprises are actually seeing, where they decide, “Okay. I want to go with the commercial offering.”

Solomon: Well, first of all, the commercial offering is very new. Our priority is adoption of the engine. If an enterprise can use both, that’s great. If they only use the engine, and they need to take a little more time to evaluate the commercial products, wait for a feature to be available, then, that’s totally fine. We’ve designed it that way.
For example, our cloud product does not have a self-hosted version yet. 

So, some customers do not care, and they’ll buy it today. And others are waiting for this self-hosted version. When we talk about the business value of Dagger, we’ll talk about the whole of the platform, open-source engine cloud for the buyer. Either it solves a business problem for them, as a complete platform, or it doesn’t.

Unique Features

Mike: There’s a number of tools in this area already, who are those super fans who say like, “I know about all that other stuff, but that’s not for me. I really need this Dagger.” Who are those people?

Solomon: Usually, in every software team, there is at least one person who’s the designated DevOps person, they’re the person who will have to fix the build, or make it faster, get the CI pipeline going, sort of support the developers in shipping. And then, over time, as the team grows, you’ll have more than one.

And in larger enterprises, you’ll have entire team, platform team, DevOps team, whatever – those people are typically the people who will get excited about Dagger because it makes their job easier.

Developers we help indirectly, the DevOps people we help directly. The main reason they get excited is because we don’t force them to throw away what they have, will improve the stack they have incrementally their terms – that just makes everything else easier.

Prioritizing Core v. Commercial?


Mike: Is there ever any friction when you’re figuring out how to allocate scarce resources at Dagger, on, “We should work on this core feature that is in the open source, or we should work on some features that improve the cloud offering, which will help us monetize.” And how do you balance those?

Solomon: Yeah, it does happen all the time. In fact, we’re small enough that it happens even within the open-source engine. Also, right now, we’re at an inflection point, where the core product is done – well, it’s not done, but it’s well-defined, and it’s well understood. And we have a community of people who are using it and want to use it more and more, which means they’re reporting bugs, they’re asking for features – there’s an incremental roadmap that we’re executing on.
Meanwhile, we’re building out this commercial product, which is, like I mentioned, it’s much newer, and it needs a lot of work. Resource contention is a problem. I don’t think we’ve found the solution. Honestly, focus is our friend here. For example, I mentioned we’ve prioritized adoption of the open-source engine.

So, to be honest with you, over the last six months, I think we got a little bit too ahead on the commercial products. We approached it like we could build both at full speed in parallel, but then we realized, when we looked at what people were asking for on both sides, we realized, okay, we can build both, but we cannot build both at the same level of priority. So, we’ve changed our text slightly, and we’ve decided to make it clear that we are prioritizing the core open-source engine at the moment.

And with the resources they are left with, we’re developing the commercial product. But we’re narrowing the scope of the features of the commercial product – in practice, that meant going from two flagship features to one flagship feature in the commercial product, for example. I think it’s just mostly being realistic and also being flexible adapting to changing circumstances.

Community v. Monetization?


Mike: I’ve noticed that with a number of start-ups, their initial focus is really on getting adoption and getting those super fans and then figuring out monetization later. I’ve also had guests who said they should have thought of monetization from day one and built around that. Where do you come out on that?

Solomon: I think there’s a balance to be found for sure. For example, I’ve just said, we had to sort of scale back a little bit on developing the commercial product. I’m still very glad we started developing it early, and we’re validating it, whether there is something that anyone is willing to pay for, and also, a million details along the way – how much to charge for it, is it cloud or on-prem, what market are we competing in, who are our competitors.

The clock starts the day you start building it. And I do think completely putting off even thinking about it for potentially years can be a mistake. So, we were pretty deliberate about starting early, but then, once you’ve started, you got to manage your priorities – that’s our approach.At Docker, it was all on community and think about monetization later for sure. And it worked. I’m not surprised when you say some people regret not working on monetization sooner. I completely understand.

Why Trademark is important

Mike: Actually, I would like to dive deeper into the monetization because that’s really interesting, but before we even get there, maybe just any thoughts about the use of the trademark, and how you’re approaching Dagger differently from a trademark perspective.

Solomon: At Docker, we underestimated the importance of protecting your trademark when you’re building an open-source platform. Ironically, the best example of doing that right is also the same company that abused our trademark the most – namely RedHat. RedHat really is a great model for how to be extremely open on your code. Red Hat has always been serious about open-source licenses, opening up even proprietary products from companies that they bought. They would open up the code. They really walked the walk on copyright on the license of the code itself, but the way they made it work is they’ve always been extremely strict and enforcing the use of the RedHat trademark.

It is the best model for an open-source business. The stricter you are on the trademark, the more open you can afford to be on the code. In practice, that means this is open source – you can fork all of it and redistribute it. It’s all fair game. You can take all of it or modify it and redistribute your modified version. But you can’t call it in our case Dagger, you have to call it something else.

Ideal Use of Trademark

Mike: What was the impact on Docker as a result of people, or organizations, using the trademark?

Solomon: The problem was the kleenex problem basically, that Docker washing became the problem. As Docker popularized containers, and containers became the hot new thing, Docker became the hot new thing, and the Docker brand became very valuable. You know, the whale logo, everything associated with it – something new, something exciting. This community that was just blowing up – we had I think 120 meetup groups around the world, hundreds of thousands of people physically coming every week to just show each other a cool stuff they were dealing with Docker.

That brand was basically very valuable, and anyone, any vendor could just take it and say, “Oh, yeah, we do that.” Of course, it meant that Docker as a business did not enjoy as exclusive a competitive advantage as it deserved.
And also, over time, it hurt the quality of the experience, because you could go to a random vendor and they tell you you’re getting Docker. So, you install their thing, and you think you’re getting Docker, but you’re getting some weird Frankenstein product that maybe there’s some Docker inside it somewhere.

But the experience is not up to my standards, or the standards of the Docker team. So, you’re going to walk away disappointed, it didn’t work properly, it wasn’t compatible. They said it would be portable, but it only works on this vendor system, whatever the problem is. And now you’re going to associate that negative experience with the Docker brand. So, losing control of your brand is a really terrible thing to experience. The way to not lose control of it is to enforce your trademark in a very strict way.

Mike: Although Dagger is open source, you don’t want anybody doing Dagger hosting, or introducing a Dagger powered product. What about the hosting side? We’ve heard a lot about open-source data. If somebody used just Dagger hosting, we’ve seen WordPress hosting here in Austin – where are the limits, or where are the boundaries?

Solomon: Our approach is, if you think Dagger hosting is a great thing to do, come talk to us, and we’ll partner. Actually, we’re having conversations now that actually becomes forcing function for designing it. Okay, let’s talk about what does Dagger hosting mean actually. What that experience will be is not as straightforward as hosting a database. There are questions of what’s going to run on the client, what’s going to run on the server – those things are still being worked out. So, now we get a chance by forcing people who want to sell hosting for Dagger to come to us.

Now, they’re part of the design conversation, now these potential partners, we are showing them the architecture we have in mind, we’re showing them the feedback we’re getting from our community what they want.

And now, we’re all going to design this together. And if it still makes sense in the end, and they’re still interested, then they’ll get a license to use a trademark. Or they could just say, “You know what? This is great. We don’t need your brand, we just want the code, and we’re just going to do hosting, or something – they change the name, and then they can host it now, they don’t need our permission for that. Keeps us honest at the same time.

Mike: Because they have the right to use the software.

Solomon: Exactly. Because it’s actually open source. There’s no special license or anything.

HashiCorp shows system worked?

Mike: There’s another company I thought you were going to mention when you mentioned RedHat, and I think RedHat’s also underappreciated. But I was thinking of HashiCorp. They actually did and continue to do a very good job of defending their trademark – TerraForm.

However, they did run into some friction with the community. Because, after a billion downloads of their software, they changed to a non-OSI approved license. And there was some blowback from the community. And I think they broke a social contract in a way. By your definition, they did it right, but is there anything they could have done better?

Solomon: I think that episode with HashiCorp was actually very useful for start-ups like Dagger that are earlier in the journey. We’re telling the markets, “Here’s our open-source product, here is our business model around it. And it’s a sound model, and you can trust us that everyone wins.


RedHat was very useful in standardizing part of that model. Okay, RedHat has opened everything, and they’ve been very strict on the trademark. And they’ve been successful, so that helps us make our case. But one question we can’t answer with only that example is, okay, but what if later you change your mind, what if later the founders leave, and the professional CEO takes over? Or, what if you sell? And you can’t say, “No, we won’t.” I mean, because you don’t know. And that’s what happened to HashiCorp.

And what’s wonderful about this example is that it turned out okay for the customers. Because what’s happening now is, there was a fork, it’s run by a foundation. Now, there’s one more option. After HashiCorp made this decision to break the social contract, basically they were punished or rewarded, depending on how you look at it, with a community-run alternative, which customers can now choose to switch to at any time.

Now, I get to say, well, we don’t plan on doing this, there’s no good reason for us to do that, but just in case, a few years in the future we change our mind for whatever reason, here’s what will happen: we’ll get forked, there’ll be a community-run alternative, and you’ll get to use that. So, it’ll be fine.

Mike: So, the system worked.

Solomon: Yeah. I think the system worked. I have no clue if in the end, this is good or bad for HashiCorp, if they made a mistake, or if it’s not that important. I mean, I don’t really have an opinion on that, but I know it helps us make the case for our model now.

Do we need a new OSI model?


Mike: We’ve recently attended SoCon, the State of Open Conference in London, and Bruce Perens gave a talk, where he suggested we need a new open-source definition. And we see open-source companies moving to other types of licenses that are slightly more restrictive. Do you have any thoughts about whether maybe we need some innovation in this area?

Solomon: I think we do. Honestly, the elephant in the room is this battle over what’s open AI – well, there you go [laughing], that’s the problem right now. What does it mean for AI to be open and what happens when the closed vendor has opened in the name, and then you have open-source models that have a closet says, you know, Apple can’t use it, or something. Neither open AI, nor so-called open-source models today, meet the OSI definition of open source. Is that okay or not, that’s the debate. But I think given just the enormous attention on AI right now, if anything causes the state-of-the-art in this area to change, it’s going to be that.

Every other ongoing point of contention is dwarfed by the focus on AI. That’s my opinion. Maybe that’s an opportunity to leverage that attention and that desire for change. And that those tensions channel it into constructive changes to the standard. It’s dangerous, because maybe what ends up happening is the standard shifts in the wrong direction.

It’s possible that we regress, that we actually instead of getting incremental improvement on the current OSI model, maybe we lose benefits of it that we take for granted at the moment. And on top of all of that, even the definition of software itself is called into question. Like, what if all the software is generated by a model anyway.
So, I’m glad it’s not my job to figure that stuff out. I’m glad I’m not the expert.

Does Open Source Pattern Match with a Tragedy of the Commons?


Mike: A professor here at UT, and we’re recording this at University of Texas, Austin, wrote a paper comparing open-source to tragedy of the commons. She asserts that actually open-source economics pattern matches with some of the things that go wrong in a tragedy of the commons and proposes some remedies.

Solomon: I completely agree with that analogy, but I also think it’s getting applied wrong. Because common use of a limited resource – that needs to be replenished, the land. But software doesn’t need to be replenished – it doesn’t matter if one person downloads it, or 10,000 people download it. The bits themselves, once shipped, are not a resource that needs to be replenished. But the maintenance effort, the support burden is. And so, I think the grazing is not when you download the open-source software and then you use it without paying back. I think that the grazing happens when you’re filing a bug on the GitHub repo, and you’re expecting a maintainer to read it and then answer your question.

Or, you are sending a patch that you really need to see merge for your own downstream needs, and you need a maintainer to review it and give you feedback on it, ideally merge it yesterday. That’s grazing.

So, the resource, the equivalent of the land is not the software. I think the equivalent of the land is the people maintaining it. Because those maintainers behind the scenes are burned out. They are holding the world on their shoulders, and a lot of times they’re volunteers. But even if they’re paid, we’ve had people burn out of Docker because the world showed up, and they all wanted the Docker engine, and they wanted this new thing merge, and they needed this bug fixed, and they all needed it yesterday. And some of them were very demanding.

And some of them had millions and millions in contracts on the line, and RedHat was a culprit in that, for example. And we had people burn out not just from Docker, but from tech. Because they just cannot take it. You know, I have great hair now, and I didn’t when I started as a maintainer of the Docker project. We need to solve that. And I think tragedy of the commons is a perfect analogy there.

How to Maintain a Mountain of Open Source?


Mike: You mentioned open AI, and when I look at open AI, I see that it’s built on a mountain of open source. But they have the connection to the customer, which gives them that sort of last mile that enables them to extract value. What I’ve noticed is that there’s an inexorable stream of CVEs. That, because my software is built on a mountain of open source, some of those dependencies are always getting a rear patching once a month, just to keep up with it.

And I think the expectation for an open-source project is not just that it’s open source, but almost like this social contract that you will continue to patch it forever. And when the next log4j happens, we expect you to do it tomorrow.

 Solomon: I agree. Yeah, that’s part of the problem. It’s like an open-source project is a service. The unspoken contract includes everything you’re talking about – ongoing maintenance, ongoing support, ongoing operations of the project itself, running the CI pipelines for it. There’s no system in place for doing that in a sustainable way. I think it’s an unresolved tension in our industry today.

I just think sometimes we go down the wrong rabbit hole, trying to solve it, because we just frame the problem – it’s not a software piracy problem, the problem is not that people are using the software for free to make money because that’s normal, that’s expected. If someone has to wake up at 4:00 a.m. because a really terrible security vulnerability was just discovered and they have to patch it because only they know how. And they were volunteers, and they have billion-dollar companies depending on it, and the billion-dollar companies are the ones calling them – that’s not okay. That’s like fundamental problem for everybody involved. And that’s an actual real-life scenario. That stuff happens. So, yeah, we got to figure that one out.

Lessons from Docker Monetization Battles


Mike: Yeah. This is a global problem, and it’s really hard to solve global problems. I’m going to switch back to Docker, and again, excuse my ignorance, I’m not a Docker expert, about the history either, but one thing I have noticed is that they seem to be doing a little better now.

Solomon: Oh, yeah, yeah.

Mike: And so, monetization and pricing you mentioned, I wasn’t even going to ask you about your price journey for Dagger because it’s too early. Whatever you think of now probably is going to change. But you do have some interesting perspective, I think, on having this incredibly, maybe epically, record-breaking popularity in terms of who loved the software, but then also challenges around monetizing. And then, also now, the ability to see what they’ve done. And I’m just curious if you had any thoughts on what they’re doing now?

Solomon: Of course. A very common thing I hear is, “Docker struggled as a business because they gave too much away for free.” And I really think that’s wrong, meaning Docker was correct to open source, and it never had to be backtracked. Docker never had to change its license and anything while I was there, and after. At no point did Docker have a regret open-sourcing something – there was no temptation to walk it back. I think that’s a victory.

And second thing, there were many, many opportunities to monetize on top of this immensely popular open-source project. And many companies successfully did that, pretty much the whole tech industry made buckets of money off of Docker, except for Docker, for a long time. And that’s not anybody’s fault other than Docker’s.
Docker failed to build a commercial product that was exciting enough for people to buy. That’s the reason Docker struggled. It was purely an execution problem. It was ours to lose, and we screwed it up. And all this typical startup failure ways lack of focus, which comes from an inability to say no to something, so you can actually focus on other things.

And it’s just lack of discipline on hiring and expenditure and strategy. So, ultimately, instead of shipping one great commercial product, we shipped eight average ones. If you look at the numbers, if you go up to the point where Docker got closest to that, and it got recapitalized and split in two, and the commercial part got sold off to Mirantis.

And the core assets, the brand, the open-source tools, the developer tools lived to fight another day. By that point, Docker had spent 300 million dollars to build a 60-million-dollar business. The math doesn’t work out. So, yeah, that was the main reason what caused Docker to be successful now, is a very simple – they had to downsize, sell off that failed enterprise business, they were left with the open-source Docker engine, Docker Hub and Docker for Mac, these desktop apps install Docker on your desktop. The simplest thing in the world. It just so happens that when we shipped that back in 2016, we did not make it open source. So, we have this really easy Mac application: click, click, you got Docker. We bundled that as a binary, and we did not open source it, and nobody cared. And it was free. And then, one day Docker said, “You know what? This application is still free unless you make this much in revenue as a business, and then you have to pay us now.”


And that was it. That turned Docker into a successful business pretty much overnight. So, not sure what lesson to take from that. The product that we ended up monetizing successfully in what, 2020 let’s say. It had been there for four years. You just had to put a price tag on it.

YC Twice?


Mike: So, you’re a new entrepreneur. Are you going to recommend going through the Y- Combinator?

Solomon: Yes. 100%.

Mike: On your second start-up, did you go through YC again?

Solomon: I did.

Mike: Can you talk a little bit about why? Like, you knew everything from the first time around, why did you do it again?

Solomon: It’s a complicated answer. The shorter version is, I joined a little bit later. It’s three of us co-founders at Dagger. My two co-founders, Sam and Andrea, they were first employees at Docker. They left, and they started something new. And I joined a little bit later. The reason I joined later is because I was taking a break, and I was a visiting partner at Y-Combinator for one batch. If you do this thing, they invite entrepreneurs to be a partner for a little while. At the same time, they were asking me the exact same question you asked me, “Hey, should we join YC?”, and I said, “Yeah, you should join YC totally. You’ll meet new people, you’ll get a lot of help.”, because they were first-time founders.


And they ended up assigned to me, so I was their partner, one of their partners. We spent a bunch of time together at YC. I was a partner, they were founders, and we talked about their idea what to do. And then, we got excited about it together. And at the end of the Y-Combinator batch, I joined as a founder, which means that now we’re a YC company, but like technically, I did not go through YC as a founder of the first time, if that makes sense. I did not have the opportunity to make that decision. But if I had had the opportunity, I would have done it.

Because I had been through YC, but my co-founders hasn’t. So, as a group of founders, we had not gone through it together. And that’s very valuable. And also, YC got better over time. New partners, new programs, new resources, and more importantly, more alumni.

So, the other founders in the batch are some of the smartest people you’ll meet. Being surrounded by other founders and talking about your founder problems with them, and then staying in touch and growing your network that way and helping each other after you leave Y-Combinator. That’s incredibly valuable. I always tell everyone, you should join YC if you can. And if you’re an outsider, or a first-timer, then even more so.
And if you say, “Okay. I’m a second-time founder, I’m not going to do it.”, I’ll still say you should do it, but I understand the way you’re not doing it. If you’re a first-time founder, there’s no reason not to go through YC, if you get a chance.

Advice for Open Source Founder?


Mike: Last question. Do you have any final advice for founders who want to use open source as part of their business model? Just some quick advice.

Solomon: Yeah. I would think of it as a tool in your toolbox, as a founder. It can be the perfect tool, or it can be the wrong tool. It really depends on your product, your positioning in the market, your strengths as a founding team. So, I would not blindly apply a playbook. And I would always make sure that if you’re open sourcing something you know why, especially for engineers, engineer founders.


There’s always a pressure to open-source everything, because you want to be loved and respected by your peers, giving things away. Open source is just a shortcut to that. Sometimes, the right answer is to not open source something to withhold it. And sometimes the answer is to open source. It really depends on the situation. So, be strategic about it, my general advice.

Closing Credits


Mike: Solomon, thank you so much for sharing all this with our audience, thank you so much.

Solomon: Thank you.

Mike: Thanks to the Dagger team for volunteering Solomon, to Alex from Resonance Public Relations for suggesting this interview, and to the CIVO Navigate team for the logistical help with the recording schedule.

Don’t forget to check out CIVO Navigate next year if you want to learn more about cloud native technology.

Cool graphics from Kamal Bhattacharjee. Music from Broke For Free, Chris Zabriskie and Lee Rosevere. 

Next episode, in a slight divergence from the format, we’ll hear from Patrick Bachmann of Open Ocean, in his role as Venture Capitalist that funds high-growth open-source software start-ups, informed by his roles at MySQL and MariaDB.

Until next time, thanks for listening.

Episode 65: Scaling Data Pipelines with Nick Schrock, Founder/CTO of Dagster Labs

Intro


Mike Schwartz: Hello and welcome to Open Source Underdogs! I’m your host Mike Schwartz, and this is episode 65 with Nick Schrock, Founder and CTO of Dagster, a platform that helps companies create data pipelines, which is critical to transform and update data in order to make it useful, for example, to generate reports, content, or other actionable information.
Dagster might not be a blueprint you can emulate. Like all start-ups, there are some hard to replicate serendipity that enables Nick and his team to build this amazing company. But as Machiavelli says, “Great leaders need both – fortune and virtue.” In other words, you need to be good at what you do, i.e. virtue, but they also need some good old-fashioned luck.

But what separates a really successful founders, like Nick, is the ability to harness fortune and virtue and combine it with some deep insights about the market, and turn it into a profitable and fast-growing venture not easy to do.

So, with that said, let’s cut to the interview, and let Nick tell you, in his own words, how Dagster evolves.

Early Career

Nick Schrock: Great to be with you.

Mike: Nick, thanks for joining us today.

Mike: Can I just go back a little bit and ask you to share some of your story about how you ended from going from the University of Michigan Computer Science to working at Facebook? So, that early period – how that happened?

Nick: Oh, I wasn’t expecting to talk about the preface book days. I’ll do the quick version of that. I graduated from Michigan in 2003, and I actually went to work at Microsoft, right out of school. And Microsoft’s a great company, and they treated me well, but… 

And actually, the division I was in was the developer division. And I thought that they were just extraordinarily talented, but at that time of my life, that wasn’t for me, in terms of working at a big company.
I wasn’t actually sure if I wanted to do software anymore, so I went to the London School of Economics for a year, because I thought I might want to go more into finance, or even government service – you know, I was a young man kind of searching around.

But I ended up getting back into software. I worked for a healthcare start-up out of Ann Arbor, which is where Michigan is, for what – 2 and a half years.

And then, I went to Chicago to try to do a start-up. That was very quickly spun down because me and a friend, who had worked in the finance industry, we wanted to do it, but then, it was about 6 months before the financial crisis.

So, that was incredibly poor timing. I spun that down, and actually, turns out a friend of mine, who I knew from Microsoft, kind of heard that was on the open market, and he just reached out and was like, “Hey, I’m working at Facebook, it’s really a special place. You should consider looking at it.”

And I was looking at staying in finance in the Chicago area. And I flew out to Facebook, and it’s just the vibe difference between a place like Facebook and a hedge fund in Chicago cannot be overstated.

You know, everyone at Facebook was young, super excited, idealistic, the office was incredible – there was just all this energy versus all these miserable people working in the hedge fund. So, the choice was obvious from there. And then, off to the races after that.

Why was Facebook so innovative in 2009-2015?


Mike: So, what was it about Facebook in 2009 that made it such a hotbed of innovation? Like, what new problems were they trying to solve?

Nick: The engineering-driven culture there, combined with the actual product that was being built. So, the product grew at unprecedented rates, it was used in unprecedented ways and was data intensive also, in kind of an unprecedented way.

We were forced to kind of do a lot of innovation on the fly in incredibly constrained environments actually, both in terms of resources, timing – you know, we had to get stuff to work. And I think that it is true that those constraints do breed innovation.

And that time of period was interesting because in 2009 – how to put this – we weren’t really taken seriously as an engineering organization, I felt. And then, fast forward say 4 to 6 years, and we were taken very seriously as an engineering organization.
It was really cool to participate in that. And in the end, if you look at the output from that eng org at that time, it really is pretty extraordinary in terms of what systems were built internally as well as what was open-sourced.

Technical Origin

Mike: So, few years back in 2018, after being at Facebook for, I guess, maybe 8 or 9 years, you decide to start a company called Elementl, which becomes Dagster Labs. Can you talk a little bit about how that came about?

Nick: Near at the beginning of my tenure at Facebook, I helped create this team called Product Infrastructure, whose mission was to make our application developers more efficient and productive. So, concretely what that meant is that we build internal frameworks and abstractions for the engineers who actually built the site and the mobile apps to build product.

That team did a lot of great work, and we ended up externalizing about a bunch of that work in the form of open source. So, React came out of that group – I had nothing to do with React, but kind of the people across the hall from me, so to speak, produced React. And that obviously went on to be an extremely successful open-source framework. And then, what I’m personally more affiliated with is, I’m one of the co-creators of GraphQL.

I’ve lived and breathed developer tools for a long time and also seen the impact that open-source adoption at scale can have. So, that was definitely on the mind when I left Facebook in 2017, and figuring out what to do next.

And in fact, I was going around the Valley and talking to companies, both inside and outside the Valley actually, about what their biggest technical liabilities were.
And this notion of data, an ML Infrastructure kept on coming up over and over and over. And I decided to dig into this, and very quickly I discovered that this area kind of pattern matched to what I care about and the types of problems I want to work on, typically the things I like to work on is to share a bunch of properties.

One are just engineers in pain. Like their dev workflow is broken, they have bad abstractions, they’re not productive, and purely because of tooling and abstraction reasons – that actually kind of makes me angry and frustrated on their behalf. And on a personal level, I feel that is really motivating.

Second involved finding – yeah, I like to call it like “a problem that matters”. I like working on really broad horizontal problems that could potentially have impact on millions of developers, kind of core essential problems that matter.

I was data engineering adjacent at Facebook, I wasn’t a practitioner. Data pipelining is extraordinarily important actually. People like to dismiss it as data cleaning, or they are kind of data janitor work, but when I looked at it, from kind of fresh perspective and I really thought about it, I was like, listen, data pipeline, they produce these assets, these data assets that drive all analytics, all the dashboards that you work with, all the ML models.
And if you really think about it, these data assets drive a huge proportion of human decision-making and automated decision-making in our entire society. Who gets mortgages or not, how do we price health care, what kind of news do you see – these are fundamental essential things, and it needs to be built on solid foundations.

And the fact that it – in my opinion – like, it was not built on the appropriate tools and processes, and everyone felt it was like chaotic and out of control all the time, was deeply disturbing. So, things were fundamentally, and still, in some ways, are fundamentally broken in data ML engineering. So, that’s really motivating.

Another thing, another property is that I like working on technologies that are sort of a strategic point of leverage in an organization. GraphQL fits that bill. Because if you kind of can intermediate all client-server interactions with a common software layer that has rich scheme information and stuff like that, it’s like an enormous point of leverage for tooling.

And in the data space, I quickly gravitated towards the orchestration layer because I felt it had the same properties. You know, orchestration orchestrates data pipelines. That means, it invokes every single runtime, it touches every single storage system as a result. And then, likewise, any practitioner that wants to put a data asset or pipeline into production has to interact with orchestrator in some way shape or form. So, a strategic point of leverage, I thought that was super, super industry.

And then last, like some feeling that you have a technical insight that’s novel and interesting, and that’s kind of how we got to this notion of — at the beginning we called it Software Structure Data Sets, but now we call it Software-defined Assets in data pipeline.

And the basic idea is that instead of just writing a bunch of imperative tasks to string stuff together, you instead think about it, you write a software representation of the data asset that you end up wanting to ship to production and be consumed by our downstream stakeholders.

So, that was a very long answer, but I found a problem that kind of checked all the boxes, for what I like to work on. And it’s not just checking boxes – if those boxes are checked, I’m like deeplypassionate about it. That’s kind of how I got here.

Business Origin


Mike: You started working on this problem at Facebook, but then you said at some point, you sort of hit this critical mass of like pattern matching, like you said. And you’re like, “Okay. I’m going to start actually a business. Maybe in Silicon Valley, it’s not terrifying, but it’s a big step.” How did that actually work? When did you decide, “I’m going to start a company.”?

Nick: It’s funny. I’m struggling to recall exactly when it happened, but I knew founding company was definitely something I was very interested in doing. Both in terms of working on a product, but also building a culture, and especially engineering culture.

In terms of company building, that part was very motivating. In a lot of ways, I was talking about how I thought the kind of the output and culture of early Facebook engineering was pretty extraordinary. And replicating the good parts of that in an independent organization was super appealing to me as well.

I think I just started talking to people and my message and the problem I identified really resonated. And then, I was talking to some investors, actually not with the goal of doing a fund raise – it’s kind of funny how it works like that – but there was like, “Nick, you want to look at data pipelining, with your background and, you know, work on something, that we should really think about formalizing this with some capital and a company, so you can accelerate your progress.”

It’s one of those things that almost just kind of happened. And I’m a big fan of, “Be an opportunistic.” It’s also true that from the time I left Facebook, I knew that founding a company had a lot of appeal to me.

Transition to new CEO


Mike: One of the podcasts previous guests, Sytse Sijbrandij, once asked me, “Do you love the product, or do you love the business?” And it’s an interesting question. I think I know were you following that spectrum. And can you talk a little bit about how you came to work with Pete Hunt, the current CEO, and do you have any advice for founders on how to navigate when there’s a pivot in the leadership?

Nick: I might like the business more than you would expect. I obviously – I don’t want to put words in your mouth – but I’m assuming you think I like the product more than the business. Actually, I did a bunch of economics and business in college and then the grad year in LUC, and I thought about doing MBA, so I’m definitely a business-minded. I imagine I annoy our FinOps people because I always like dig in about all the financial metrics and whatnot.

Yeah, we can get to Pete. I knew Pete from the Facebook days. He was one of the co-creators of React. We didn’t work really in-depth with each other then, but we met each other socially and through each other’s work, and really kept in touch for a long time after Facebook.

He wrote a small seed check into the company. We also collaborated actually on some podcasts because we were kind of obsessed with this Facebook engineering culture, and we actually put together a podcast series, Software Engineering Daily, with like 15 ex Facebookers, and we learned a lot about each other during that process.

Pete had started a start-up and sold it to Twitter, and he was working on Twitter. And I was also talking to him on and off about the business. And I was in the market for a head of engineering in early 2022, and Pete and I discussed it. And I was privileged enough to bring him on board. And given his experience, formerly being a CEO of a Dev tools company, he had built a marketing organization, and the sales organization and scaled to $5 million ARR.

I knew he was going to be much more than a head of engineering – I even had super high expectations for that – but he really dramatically exceeded those expectations. And I think, it became very obvious to me that he was just way better operationally than I was, in terms of like the mechanics of management, organization building, managing marketing, managing sales – he had done it before, and it was pretty clear.

I, at the time – just to be transparent – I was solo founder CEO, I moved around the country a couple times, I had 2 little kids. Now they’re 2 and 4, but I’ve also started a family during the course of this journey – I just needed like a co-founder figure to share the load.

Because I didn’t have the time to work on what my superpowers are, which is kind of this cross-product of Engineering, Dev Rel and Marketing I think is where I excel. And the other stuff is like, he could do a much better job with that. So, it just made a ton of sense.

I think I’m very lucky in that I don’t think it’s a repeatable process for a lot of founders to do what I did. Because you need that other human, who you know well, who would have been I think if Pete had his own company at the time, we might have just co-founded something from day one, and had like enormous trust context in – like, the transition to bring him in and then move him to the CEO position was like super smooth. I think it was like super obvious to everyone they knew it wasn’t going to be like this massive culture shift. Because like Pete and I are still aligned on so many issues.

I think the entire team was super excited about it, and the transition was really smooth – no leadership changes, no attrition, the company started performing better. I think it was obvious pretty quickly that that is the right move.

Monetization


Mike: So, diving into the business a little bit, how does Dagster monetize? I see a cloud offering, is there also a license enterprise distribution?

Nick: No. We only do a cloud product. So, just for context for the audience, Dagster is a data orchestration platform. And you can think about it like, you write data pipelines in this Python framework for building data pipelines and orchestrating, meaning, ordering computations and modeling the assets that get produced by those computations.

You can install it open source, and people have deployed that to production – a ton of people, I should say we have thousands and thousands of users – but the cloud product allows us to do a ton of the hosting on your behalf.

Most of our enterprise customers have this hybrid product, where we host the control plane, which you think about it like everything is complicated – the metadata database and long-running processes that monitor things and whatnot. Then, they run their actual compute, it’s their data pipelines and their infrastructure.

So, yeah, there’s a cloud product you sign up for, we can host a bunch or all of the compute. And then, also, we add enterprise features on top of it – SSO, alerting, gobs and gobs of features that generally deal with complexity in the Enterprise that companies typically pay for.

So, that’s our primary business model: you sign up for Dagster cloud, you swipe your credit card or talk to our sales people, and you can have the best experience of a data orchestration platform in the world in our opinion.

Why sell small customers?

Mike: I noticed that Dagster sells to small teams – like you said, you can sign up for like 100 bucks – and also to large enterprise. I’m wondering does the small teams’ business actually add up to real revenue, or is it just a pipeline for enterprise customer?

Nick: I think in terms of what investors care about, and what the long-term trajectory of the business is, we certainly conceptualize it as mostly a driver of pipeline – yes – but a broader adoption as well. So, there’s tons of users that use our hosted product that wouldn’t use our open-source product. And simply because they don’t want to host their own computing infrastructure, which is totally reasonable.
So, I guess, if you kind of boil everything on the business, yes, there is – it is a source of enterprise leads, for sure, but it’s also a source of more adoption, which means more people talking about the product. More people having being passionate about the product.

Because an underlying flywheel adoption is also essential for the long-term commercial success of the company.

I think like that’s the most interesting component of it. It used to be, say 10 years ago, that you’d have an open-source product and you’d be like really pulling teeth to use the commercial or the hosted product.

I think the pendulum is really shifted now, where tons of people wouldn’t consider adopting an open-source technology if it didn’t have hosting options. Just because of the way that the entire world has shifted towards more hosted services, which is I think a win-win for everyone involved.

Pricing

Mike: One of the underappreciated challenges of a tech start-up is how to price your offering. I saw a note on the pricing page about an old plan and a new plan. The new plan’s a little complex – not being an expert, I couldn’t really quite follow it. Can you talk a little bit about the pricing journey and where and why you ended up where you are?

Nick: Totally. I like to say, if building an infrastructure company were a video game, pricing is the final boss. And that actually even undersells it. Because iterating on your pricing model is a continuous process, where you have to make sure that it’s working for everyone involved, that we can run a healthy business and that the customers feel like they’re getting a fair deal in terms of — because in the end, they need to get more value than they paid for.

You are correct to point out that the initial pricing was simpler than the current model. Initially, we started out where we wanted to have like no seats limit and just charge on consumption. I felt that a very fair way of doing consumption was to just charge on the number of minutes your pipelines run.

So, the issue with that – and I think this is a good takeaway for your audience – is that customers have to morally accept the pricing plan. Like, it has to make sense to the underlying way that they think. And the problem in a data pipeline solution, if you’re charging by, say by runtime, is that frequently what you’re doing in orchestration is that you are like calling out to Snowflake or Databricks or some other heavyweight computational system that does all the heavy lifting of the compute.
So, from the standpoint of the customer they’re paying us just to kind of wait for an API call to complete. That shifts the mind of the customer to think of us as just a compute hosting service.
And if you’re just doing that, the value proposition of our product doesn’t make sense.

So, the pricing impacts the way that the customer perceives the value of the product, which is obvious when you say it out loud, but isn’t obvious when you’re kind of in it.

We’ve really stepped back and looked at this – the real value in an orchestration system is in the kind of the control signals and the metadata. Like, concretely, you open up a orchestrator, or our orchestrator, and you see all these fancy Gantt charts of what’s going on, you have a ton of visibility, and then the words that our users often use is, “Ugh! Dagster is like the single pane of glass that consolidates my entire data platform, I have visibility into all this stuff.”

So, that’s where they perceive the value. They do not perceive the value like it’s a hosted compute service. That had the benefit of being simple, but didn’t actually align with the product value that the users perceived.

We switched to charging based on metadata and control plane events that drive our UI. I think the other thing is that for founders in the audience is that you have to have a pricing model that works for sales. And early on, you don’t have enough data to know how much consumption there’s going to be for a customer, for like say the next 12 months. And with the way sellers work, they have to hit their ARR number ― that adds up to their quota, that determines whether they can feed their children or not. So, it’s very important to the sales team.

We had to also add sort of a per seat component that effectively acts as a platform fee for our enterprise customers that allows us to kind of project and forecast ARR that would be appropriate to the value it’s going to deliver to the customer.

You also have to think about the internal incentives and how it’s going to work for sales people, who are reliant on selling your product in order to send their kids to college.

Why Audience Selection is Important?


Mike: I am going to pivot a little bit back to tech for a second, but really more to talk about the open-source community. What’s interesting about Dagster is that it reminds me a little bit about the battle between Perl and Python. They were open-source tools in your area that existed before, but they were a little bit hacky or more challenging.

Can you talk about what are some of the challenges of building an open-source community in an already competitive market, where you needed a lot of features just to get the baseline of functionality? And then, how did you focus on either getting new, or getting some of the developers to switch into your platform?

Nick: You need to make sure that you have an audience that cares about what you care about, and it is very differentiated on that dimension, to the point, where they are willing to take a risk to bet on you, to work around missing features or missing integrations that might exist in a more mature solution. So, identifying that small subset I think is extremely critical.

There’s now, I think, a kind of standard reading for Silicon Valley founders, which is Peter Thiel’s book Zero to One. And he talks about how you start with a small market and then dominate it, and then move on to progressively larger markets. And I think that really, really resonates with me, especially in developer tools.

One kind of approach – and this is kind of the nature of tools that I like to work on too – is that what you can do is pick the audience that you think has the most leverage in the organization. And for us, it’s like the data platform engineer. Like, there’s engineers whose entire job in life is to serve stakeholders who build data pipelines on top of a data platform that they build.

And a huge part of that is setting up a great developer workflow with CICD and testing, so you can actually maybe know if you’re going to break something before you push to production, which is very frequently not the case in data pipeline.

I think our early audience was really people who really got it that testing, and fast feedback loops, and developer life cycles, is like the baseline foundation of productivity. And productivity is just huge in working in the software. Because productivity is not just about doing tasks more efficiently, it’s about making an entirely new things possible.

So, yeah, I guess I kind of went for a field there, but to circle back to the beginning of the question, I think it’s audience selection and being deliberate about that, it’s really what’s important.

Governance

Mike: Recently HashiCorp has changed their license, and I see that Dagster’s published in its own GitHub repo, so you’re under the Dagster repo. Dagster is your trademark. How can you assure the community that if the board decides to sell the company to Oracle, for example, that they won’t change the license immediately? And have you considered moving the Dagster open-source project to community governance and making it safer to use for the future?

Nick: As someone who’s gone through a foundation process for another technology, we moved GraphQL to its own open-source foundation with community governance. I have a pretty deep understanding of the trade-offs here. I think it’s a question of maturity and life cycle. The risk that you said exists. There could be a boardroom coup, and I’m out and Pete’s out, and then, we’re sold to Oracle or something.

By the way, the probability of that is approximately zero, but let’s theoretically do it. And then, Oracle could change the license―that is possible. I don’t think that’s a realistic risk in any sort of near-term.
So, if we had community governance, it would eliminate that risk. However, community has a ton of it overhead. And where does the beginning of our journey for innovating, and we want to be able to move quickly and respond to feedback quickly, build features, have complete control in that way.

And that’s definitely the right trade-off for us right now. Compare and contrast that to the GraphQL story, with GraphQL, we open source the spec, a document that was meant to be very stable from day one, and evolved pretty slowly over time. So, in terms of the technical artifact there, it actually matched like having a foundation process and governance over it made a ton of sense. But for Dagster and the immediate future, we’re having more centralized control, and increased pace of execution definitely makes the most sense to us.

2023

Mike: I’m going to move to a temporal question about 2023. A lot of tech companies struggled in 2023. The Times reported that 3,200 venture-backed tech companies went out of business in 2023. Of course, I don’t know how many normally go out of business, but still it seems like a lot. I was wondering, was 2023 a good or a bad year for Dagster? Did you buck the trend and grow 100%, or did you also feel pressures on budgets from enterprise customers?

Nick: We had a great year. So, not only did we grow 100%, we grew 400%, and our NDR was north of 150%, which means, our existing customers were also increasing their contract sizes. I feel great about the business, especially being able to grow this quickly in this environment. I am also grateful that we didn’t raise round of financing in a wildly inflated valuation, with too much capital in the FED bubble in 2021.

Because, at the time, certainly, it was frustrating – a bunch of my peers were  — you know, all of a sudden, the CEO has a billion-dollar company, even though they in reality weren’t that far along in the journey.

Now, I think a lot of those people kind of are in a pretty tough spot, and they’ve had to do layoffs, and it’s painful. We kind of stuck to our fundamentals there, so, I feel very good about it.

I still think the pain is going to be very real for the industry through 2024, maybe even into ’25. Because, yes, there’s an advantage to raising a bunch of capital too, in that you have a long runway. A bunch of these companies, they have so much cash on the balance sheet, and the interest rates have gone up that their interest is actually a meaningful source of income too.

There are more waves of company death coming in ‘24 and ’25, I guess I’ll put it that way.

But we’re in a great trajectory, and I think we’ve raised an appropriate capital to the progress in the business. And we were able to raise a B in 2023, which was a very challenging process, but it felt great to be able to do that. Not many of the companies were able to do that.

Open Source R&D v. Commercial R&D


Mike: Here’s a question, and it’s a little bit about engineering priorities: you have an open-source project of which your team contributes a lot of code to, and you also have a commercial cloud product. Can you just talk sort of at a high level, from an R&D perspective, like how much of your budget gets invested into your product versus how much gets invested into the open source? And how do you balance those priorities?

Nick: It’s actually hard to tease apart. Because, if you’re an engineer who is working on a feature that will have manifestation in cloud, often you’re kind of spanning the entire stack and like working on the open source, but then also working with some proprietary features. So, it’s difficult to cleave it that way.

The other thing is that we reorganized the engineering, the R&D organization around company objectives fairly frequently. I actually can’t give you a precise number at any point, or historically/cumulatively, about how much we’ve devoted to both open source and the cloud product specifically.

I guess what I’ll say is that we still invest a ton of our eng resources. I would say like 40% of engineers effectively work exclusively on the open source, and then there’s another tranche that kind of spans the entire stack, and then there’s another tranche, like people who work on our cloud platform, and all the DevOps and SRS work around keeping that alive and operational.

I don’t know, I guess you can call 50/50, but it’s actually really difficult to put it even semi-processed number on it.

Dog Years


Mike: Well, it sounds like it’s really been an amazing journey. And I’d like to remind you that it really hasn’t been that long either. Only 2018 doesn’t seem that long ago to me.

Nick: Well, it seems like a long time to me, man! That’s the old joke. It’s like dog years in a start-up, one year feels like seven. I have to pinch myself. I only moved away from the CEO seat like 15 months ago or something. And it feels like a lifetime.

Founder Advice


Mike: We covered a lot of topics, but I guess, my last question is, is there any advice you have for entrepreneurs, who are launching a business around an open-source software, product or project?

Nick: I think one of the things that founders need to think about — I mean, this could be an entire hour podcast about all the advice that I would say, but couple things to think about: one is, know when to go slow and know when to go fast, especially when you’re talking about so-called “one-way doors” in Jeff Bezos speak, where you’re making decisions that are either extremely costly or impossible to undo. Company branding is challenging to change in terms of the specifics of open source and dev tools, API decisions, especially in open source, last forever. You need to be deliberate on that.

And a commercial product, you can actually iterate extremely quickly. So, I think it actually is important to kind of have two cultural muscles. One is much more upfront design-oriented and collaborative with community, and deliberate and thoughtful on API design, but you still want to have that super-fast feedback and development when you’re developing the commercial components to your product that are hosted.

The other thing I would optimize for – if I was traveling back in time and talked to myself – is optimize for getting yourself into a situation where you can have a super-fast feedback loop, with early users and customers, where you still have the opportunity to change things, and do so quickly.

If you’re in a super-fast feedback loop with a single customer, you can make API changes much more easily. And the ideal situation still is, if you are working on a technology internally at a company, where you have access to all the code that uses it, that is just super valuable.

You’re also basically getting a seed round for free, because, often you’ll have people around you, and you’ll be working on it.

So, I don’t think I truly internalize what an advantage that was, to have it done the core R&D internal at a company. Yeah, I think like there’s a little more resistance now to open source the internal tech with kind of — it’s a less idealistic environment these days. But those are kind of the top-level things that come to mind.

Closing Notes

Mike: Well, great. Thank you so much for taking time out of your day, Nick, and best of luck with Dagster Lab.

Nick: Thanks. It was really a joy to be on this podcast. Thanks, Mike.

Mike: Special thanks to the Dagster PR team for reaching out and helping with logistics. Cool graphics from Kamal Bhattacharjee. Music from Broke For Free, Chris Zabriskie and Lee Rosevere. Next episode recorded at the State of Open Conference. Peter Farkas, Co-founder and CEO of FerretDB. Hopefully, I’ll have that out in the next week or so. So, until then, thanks for listening.

Episode 64: API Service Mesh with Idit Levine, CEO and Founder of Solo.io

Intro


Mike: Hello and welcome to Open Source Underdogs! I’m your host Mike Schwartz, and this is episode 64 with Idit Levine, Founder and CEO of Solo.io, an API Gateway and Service Mesh company with a product called Gloo – not to be confused with Gluu – the company that I lead, who sponsors this podcast.
I’ve been trying to get Idit on the podcast for many years ever since I spoke with her at an Open Source Conference in 2019, and finally, her PR agent reached out to me a few months back, and, of course, I agreed immediately.

Solo is not your typical startup journey, it’s sort of a miracle it got off the ground, but once it did, they didn’t waste any time – they’re already breaking 10 million in sales.

To avoid spoiling the story, I should just stop here, so let’s cut to the interview.
Idit, thank you so much for joining us today.

Idit: Thank you so much for having me, Mike.

Did Solo Join an Incubator?


Mike: My first question, and this is sort of a different one, but it’s something I’ve been thinking about, is when you first started Solo.io – which was not that long ago, I think five or six years ago – did you join an incubator and why or why not?

Idit: I did not. I wasn’t even aware that they exist, honestly. When I started the company, what I knew is that I had some “technical” friends that I knew that I can start it, and basically started doing this – the software was more about the technology. So, I needed to learn that while I was raising money, and so on.
Honestly, Mike, I think the first VC that I met, they asked me about a pitch, and I asked, “What is a pitch, what am I supposed to do?” I really didn’t know much, I needed to learn.
I wasn’t aware of a long incubation, definitely not in those days, because it’s not very popular.
I just basically started the company around software and just tried to get some money in order to kind of like bootstrap the company. But that’s basically the things I would do. Honestly, mainly because I wasn’t aware of it.

Mike: Do you think if you could do it again, you’d use an incubator?

Idit: No. Now, I feel that they learn so much from those processes. I think it’s very good if a first founder maybe is not aware of a lot of stuff, that’s really helpful to be kind of like protected by team that has done it before and knows how to help you and guide you.

Today, I think I learned enough of the process, and I’m doing it for a while right now. I made a mistake, I learn from them, so now, I’m feeling that I’m more free to actually do it myself again, if I need to.

State of Company at Seed Funding

Mike: At the time you raise your seed funding, was the open-source project started, did you have any technology, did you have any initial customers or team? Like, what was the state of the business when you closed, let’s say, that seed round?


Idit:  No, there was nothing, honestly. Before that, I worked in the EMC. Part of the EMC, my job was to basically do cool stuff on open source. I was in business, I was in the city office, and my job was to basically, if I had a new technology and we had to figure out how we can play that. Basically, we did a lot of open source and invent development. We immediately knew that we were playing back then, in Kubernetes, Mesosphere and Mesos, and all that great kind of technology. Docker was just a new thing back then, so, again, playing in that ecosystem was immediately a thing that we’ve done.

When I started the company, there were two things that I started pitching in the beginning. The first thing that I was pitching was unikernel. It took me a few months to understand that that’s something that I would not be able to ever raise money on. Probably for good reasons.

By the time we were at home, I was pretty bored, so I built another open-source project called Squash. And that was an open-source project that related to debug microservices in Kubernetes.

And that was relatively successful project, but mainly, as I said, I think that there is a good money on it because the work that I was doing before in the open-source, I literally built a reputation of someone who is capable of doing a cool project.

How Many VC’s Pitched?


Mike: How many VC’s did you pitch in your initial seed funding round?

Idit: Oh, man, a lot. I mean, as I’ve said, again, you remember, I was on the east coast, but once I decided to do it seriously, I left the EMC, and then, I basically went to the west coast, where there is VC that is more in that space and that, yeah, I got a lot of those, a lot. I think like every founder as well.

Products?


Mike: I don’t want to go too deep into the tech, but when I look at the Solo website, I see there are a few products. I am wondering if there’s like an 80/20 rule here, where one of the products accounts for 80% of the revenues?

Idit: We don’t have 20/80, actually, that’s interesting. I think it’s probably 50/50. And the reason is because of the packages, a lot of time we’re selling them together. If you look at all the projects, the main two markets that we’re going after is, the Gateway and the Mesh market. We started with a Gateway mainly because the Mesh wasn’t — you know, we couldn’t sell it.

So, we started from the Gateway, and we knew that this is kind of like an entry point and kind of like a stepping stone to a Service Mesh, so that felt very in the area. And I believe that in the future the Mesh will grow more.

First Customer

Mike:  So, one of the challenges of a start-up is always the first customer, especially if you’re selling in the Enterprise space. How did you convince this customer to be first? What did they actually buy? And whatever they bought, does that resemble your current offering today?

Idit: Yes, actually, as I said, we started selling the Gateway, and that was a flagship product of the company. When we started, basically what we did is, we had three design patterns in a way. I didn’t do it the regular way, we did it from open source. We didn’t go and talk to customers and say, “What do you want us to build?” And then, we built it. We were more like, we’re in the open-source and kind of like say, “Okay, that seems like the right thing to do.”

Kubernetes came, you needed a new API Gateway, you wanted probably an Envoy – that’s what we believed people wanted – and then, we went to pitch.  And a lot of those customers came to us from the open-source community.

So, we learned a lot from that process. What we did, and we did it differently, because we are coming from open source, we basically managed all our relationships with our customers through Slack. Then, understood what we need to do in order to make that very, very successful in their infrastructure. And we basically got all those requirements and built them into the product. It’s very different to build an open-source project versus an Enterprise environment.

Value Prop


Mike: So, what would you say is the most important thing that motivates your customers to buy your product?

Idit: I think that today Solo is kind of like three things that we are very good at. Number one is, we really, really understand the marketing really, really well, and the technology in it, so we know what’s coming up. We know what is 20 and what is not, we’re looking at adoption – we really understand that very well.

So, we always compromise with the customer that we will bring them to the edge of the technology. If there is a new technology that is relevant, we’ll probably put it in our product. I think that’s one thing that customers like, so the perception of Solo is that it is an innovative company, which it is – it’s what we are.

The second one I think is customers in sales, which was always one of the things that is the most important to us. This work with Slack, when I started it, everybody told me that’s not going to scale. And surprisingly today, when we have hundreds of customers, it is still scaling, and the technology itself, if you look at it right now, there was a lot of shifts in the market in terms of the infrastructure that you’re running, most likely running in something like Kubernetes.

So, it makes sense that you would have a Cloud native Gateway, and when you start scaling and scaling and scaling, it makes sense that you will take care of something like MPLS and Security and Zero-Trust and Observability, and all those microservices – it’s just that this is the needed technology when you are going to scale. And that’s where the market of microservices like Kubernetes is right now.

Is Solo a Distribution of existing Open-Source Components?


Mike: Solo is an interesting company in that, in a way, you write software, you write a lot of software. But you also have a curated distribution of open-source components that you give your customers a control plane to manage and take advantage of. So, it’s not just the software that you’re writing, but without Envoy and without Kubernetes and without Cilium, you really maybe couldn’t even build a product. So, do you think that maybe this is a new model, where you add a little software on top of this huge curated distribution of other very complicated components?


Idit: Instead of creating the open-source project – we do have one, for instance, Gloo Edge is a technology that is an API Gateway based on our technology, and it is based on Envoy. I think that what we were good at was identifying, pretty much at the beginning, which of those technology would be better on Envoy, when honestly Envoy was relatively a very small community no one really knew about it, and NGINX was the chosen proxy.
We chose Istio, even though we could have competed like everybody else and tried to build a better service mesh, but I knew that that will be the choosing mesh, even though when we looked at it, it was pretty messy, and we knew that it would take you a while to get there.

I was very, very aggressive to my team saying we are not going to be competitive, we are going to use that.

And the reason is because the software that wins is not always the best software. It is the software that most people are leaning to because they will make it eventually the best software. And I think that that was something that Solo has recognized very well. All that technology, all those products that we are doing is basically we are building – and I will not say a little – we are building quite a lot of logic, ease of use and enhanced technologies on top of those — let’s call it basic component that you need.

There is a lot of complexity actually in the control plane, way more than in the data plane, for instance. But, yeah, as I said to you, this is my model, hopefully sellers will succeed with it, but yeah, I believe that open source is building an amazing technology, and that we should leverage the best.

We are also contributing a lot of those technologies. I mean, if you look at the Istio right now, the new thing that we did with Ambient that we and Google contributed to it, it’s mainly we are the main contributor to it. And Istio, we are contributing a lot to it, we have a full team that is responsible to contribute to it. If you look at this, probably I think the most engineers that are working today on Istio are coming from Solo.

How to Decide What Features Are Open Source?


Mike: I was looking at the open-core model, but I’m actually more curious about, there’s always this friction between what do we put in the community version and what do we open-source. What’s the decision process behind deciding whether a plug-in will be commercial or non-commercial?

Idit: In the beginning when we started, we had nothing, we put everything in the open source, but then at one point, we understood that that’s a problem. Because eventually, somehow, you’re not going to exist as a company if you are not going to make a little bit money at least. So, we needed to figure out that what we’re putting on to double it will make sense, we are not hard to open source because it’s very important to us that open source will be successful.

It’s why we continue contributing constantly to the open source, but we also need to make sure that we will have something that differentiates it on top of it. And the decision in the beginning when we thought about it, the Enterprise feature that people actually really, really wanted to have a provider helping them was security or stuff that will let that do. You know, Enterprise feature like HA.

So, that’s the stuff that we put in Enterprise. The question is, you are usually around technology, would it make sense to be in the core open-source project because that is where it belongs. It’s kind of like a core feature, or it’s actually an extension to that open-source project.
And therefore, it’s going to be that Enterprise edition. To us, it was very important that the core should be open. That’s the way we’re doing it.

Pricing

Mike: I always worn entrepreneurs that pricing is one of the most challenging aspects of a tech start-up in particular. Can you share maybe some of the lessons you learned about how to price in the first few years, did you get pricing right initially, did you have to do a major pivot – what was your experience there, and do you have any lessons learned in pricing?

Idit: As I said to myself, okay, maybe the real unit of contribute for instance in the Gateway is supposed to be the API call, but honestly, that will take a lot of time for me, and it’s also going to be a pain for my customers, so how can I still value how much they use it, without actually interfering too much with the customer and with my engineering team.

And what I came with in the beginning is that the data plane is usually a good assumption, because if you have a lot of call, you’d probably want to scale that data plane. And in the data plane, it’s easy to call, the customer tells me I have five clusters, this is a data plane I am using – it is very easy to measure it and if people use it more, that’s fine.


So, that was the beginning. When we added the service mesh, there was a way more data plane and there was also a way more potentially change. Because you have cycles, and the cycle is basically going directly with the application, the microservices. The microservices going up and down, so very hard to basically figure it out. We needed to change that model and we went to the cluster model.

We said, just let’s keep it simple, we don’t want it – again, it’s all about keeping it simple. That’s what was important to me. I don’t want my customer to need to have a PhD in order to understand the way we were pricing.

That’s what I did. And again, it’s probably cost me some money. I probably left some money on the table and that was fine. But again, it was all about and it is still all about Solo as the partnership. It’s all about the relationship that we have with our customers, it is a real partnership, we are seriously the extension of their team.

But, you know, stuff changing all the time, so you always need to adjust. And honestly, you are learning that from your customer. So, for instance, what we saw right now is that some of the customers that are basically using us, it is more like advanced development center kind of thing.


Innovation centers like city offices or the innovation center on the ITN, and when they are starting, usually what they want, their job is to basically build something to offer to the businessmen. So, the question is, the money is not going to come from them, you cannot expect them to have tons of budget to pay you to run it.

So, what they really want is more of the consumption model. What they want is to create something and get the platform available everywhere, without paying millions of dollars, but then, they will basically enable teams to come after. And that’s different. The model should be different, it can be how much clusters you’re running. Because it could be that you’re running an empty cluster in the beginning. So, we needed to adjust based on the customers. So, it’s always moving kind of like we are learning from the customer how we can make it better.

But again, to me, the way I’m looking at this and that’s always my motto – whether it is truthful building, writing software or selling product – I want to take the challenges on my team. For instance, I prefer right now to build a sophisticated metering that will make the best customer end-user experience for my customer, even if it’s harder.

How to Maintain High Growth

Mike: You know, I was reading an article, and it said that you were projecting five to six times growth for the next year, what is a key to obtaining this high rate of growth? How is that possible?

Idit: First of all, the market – and that’s very, very important. Like for instance, when we started, we had the Gateway that was very popular and everybody needed it, and then the Mesh came, but it took us a while until Mesh would be everywhere. Right now, there is a lot of stuff that is going really, really well for us, and that’s what is allowing us to go.


What number one is, for instance, that is still going to the graduation. So, we actually choose the right service mesh, and not only this, it is going right now to graduation which has shown maturity.
So, that by itself means that there is more demand from the market. You just need to have the right market product to sell, and when a customer wants it, it would be really lazy to grow. But I’m not going to say that there are no challenges, in economy, it could be that we have an amazing product, we have tons of money – that’s not really helpful if our customer doesn’t have money. They’re not going to buy it. Again, that point – you need to make sure that the product is a necessary, that people will need to spend money for it.

Just, again, listen to the market, make sure that you have the right market fit, which I think is the most important, thinking about the packaging, make it very, very easy for people to consume your product.

Metrics and Data?

Mike: You’ve mentioned that you’re data-oriented, and I’m wondering, what are some of the most important metrics that you track?

Idit: This is a good question. I mean, if you ask my CFO, who is a very, very data-oriented person, a lot of the metrics that is running is metrics is numbers – how many VCs we are doing, how much of it is in production and that kind of stuff. Data that I’m looking at is different than the data that my CFO, the metrics that they’re looking at. I think in every business, it’s all about people, it’s all about the people in the business, it is all about the people in the market. Why has AWS decided to do this, why has Google decided to do this, what’s going on inside this organization – all this information is not metrics, but it’s data that you need to collect in order to make the right decision.

How do I predict it five or six years ago that there is going to be a lot of clusters and that people will need a service mesh for each and Istio will be that service mesh. That was pretty crazy to do five years ago.


But I had enough data that would lead me to believe, a lot of data that would lead me to believe that this is the direction that we need to go. So, we do have the metrics of how many customer success, otherwise you cannot scale – you need to know when something is wrong and, you know, big enough organization right now that “I’m not everywhere and I don’t know everything anymore.”

What Gives You Joy as CEO?


Mike: What gives you the most joy as a CEO?

Idit: It is always your job to basically kind of like try to cover the gap that you have in the company. As in the beginning, we had engineers, but we didn’t have anybody to do evangelism, and kind of like after that, we grow, and then we got that evangelism, so I’m not doing evangelism anymore. You are always doing more stuff, and to me, the way I’m looking at this, honestly, when I’m waking up in the morning is, what is the next fire that I need to put off, like where do I have a problem with, what is not working well the way it is working right. It is seriously like that’s how you should think about it – where is the next fire will come from and how am I covering it.

And to me, I’m a person that is easily being bored, so, I like learning, I like seeing what the problem is, I’m dangerous in every position in the company, potentially. I’m dangerous enough now after six years that I learned all of those.

So, I think that, the fact that it’s never boring, but I wish it was a little bit more boring. I mean, I heard a joke from someone that said, “A founder that started a company in the last five years, what did they need to overcome?” We needed to overcome Covid, we needed to overcome the SVB with the Silicon Valley Bank fall, we needed to overcome the fact that all our competitors suddenly could have raised 100 million dollars, you know, like crazy variations with seed money.

And so, there was a lot to overcome since then and it is never boring. And I think that as someone that likes challenges, that drive “I want to be the best, I want to win.”, so, that’s what I’m enjoying.

And I’ve got an advice from Diane Greene, who was the founder of VMware. And she was one of the people that started Google Cloud, so, one of the feedbacks that she gave me when I started. She basically said to me, “You can decide which type of CEO you should be.” Keep the stuff that you really like to do or you really feel that you’re a huge differentiator. And my guess is, it is that technology is the strategic, that is my strength.

And bring strong people next to you to cover the stuff that you can give away. So, my advice is to go to market. That to me is kind of like the way I’m looking at this, but honestly as a CEO, you really do a lot of the stuff that you don’t want. I mean, your job is to fix the problem or to cover stuff and to enable the other teams. If I need to help my engineers, I will do that if I need. You know what I mean? I will do everything I need to enable the team base. That is I think very important.

What Advice Would You Give Yourself If You Could Go Back in Time?

Mike: If you could go back five years or six years and give Idit some advice, what would that advice be? It doesn’t have to be at the very founding, it could be in the early stages too.

Idit: Wow. I learned so much. It’s very challenging to run a big team and make everybody aligned. As the company’s growing more and more and more – that’s become more than another. I think that the advice that I would tell my younger Idit is basically, just follow your instincts, listen to people, but eventually, make your own decision. I think the thing that I was doing wrong in the company was, a lot of times, I’d hire a leader for market and he’d go to market. And I knew that this is not my strength.

So, even though I didn’t believe always that what they thought were doing is wrong, I let them do it because I said, “Look, they are the expert. I’m not an expert in marketing, so let them do this.” I paid a big price for it because I felt that actually a lot of times, they were wrong and it’s within the company.
So, I think that what I learned today and why I think that I would be a better leader than I was back then is because I’m going to die or succeed on my mistake, honestly. Because there’s nothing faster than us to come and take responsibility for someone else’s mistake.

Again, it doesn’t mean that you’re not going to listen, but after all the data at the beginning, if you believe, like trust your instincts, don’t assume that someone else knows your business better than you. I think that this is something that I made a mistake a lot of time, actually multiply times. Before I said, “Okay, that’s it.”

Close

Mike: Idit, thank you so much for sharing all that experience and know-how and best of luck with Solo. Although it doesn’t look like you need it, you look like you’re doing amazing, so, congrats.

Idit: You always need more luck, but thanks.

Mike: Special thanks to Idit and the Solo team for reaching out. Cool graphics from Kamal Bhattacharjee. Music from Broke for Free, Chris Zabriskie and Lee Rosevere.

Next episode’s expected Jan of 2024, an interview with Nick Schrock of Dagster. I’m slowing down a little bit, but I’m still trying to do four episodes a year.
Don’t forget the State of Open Conference is returning to London, Feb 6th and 7th. So, until next time, this is Mike Schwartz, and thanks for listening to Open Source Underdogs.

Episode 63: EBPF Networking Isovalent with Liz Rice – Chief Open Source Officer

Intro

Mike: Hello and welcome to Open Source Underdogs! I’m your host, Mike Schwartz, and this is episode 63, with Liz Rice, Chief Open Source Officer at Isovalent, the software startup behind Cilium, an eBPF-based Networking, Security and Observability project. 

This episode was recorded in early February at the inaugural State of Open Source Conference or SoCon, which was held in London at the QEII Center in Parliament Square. The force of nature behind SoCon was Amanda Brock, CEO of Open UK and editor of the essential book Open Source Law, Policy and Practice, 2nd edition. Check it out on Amazon if you’re an open-source founder. Don’t miss SoCon next year in 2024, especially if you’re already in Europe for FOSDEM.


If you think eBPF or enhanced Berkeley Packet Filter sounds like a geeky low-level technology that you don’t need to know about – well, you’d probably be wrong. It enables developers to safely write code that runs in the Linux kernel. And safely is the key word here, because if you crash the Linux kernel, everything on the whole server goes down, all the containers, and everything else running on that server.


However, by exposing the power of the Linux kernel, developers can write code that runs faster and consumes less energy, and faster and cheaper has always been an attractive feature. Cilium combines three products into one. It’s like an old-fashioned firewall, an API Gateway and Wireshark, and it’s Kubernetes pod aware. It’s used by a number of successful products like Teleport for access management or Solo.io Service Mesh.
Simply said, eBPF is going to fundamentally change our infrastructure.


I met Liz at the SoCon conference, and after learning a little about Cilium, I was really impressed, and I asked her if she would come on the podcast, and luckily, she said yes. So, here we are with the interview.

Mike: Liz, thank you so much for joining me today.

Liz: Thanks for inviting me.

Tech Overview


Mike: As I understand it, Isovalent leverage’s a kernel technology to build a product called Cilium Enterprise. The upstream Cilium project on GitHub has over 22,000 commits and 14,000 stars – these are really impressive numbers for a project that started in 2016. How did this happen and how does this relate to the origin story of Isovalent?


Liz: Yeah. So, Cilium is built on a platform called eBPF, which is the kernel technology that you referred to. And eBPF allows us to run programs that are triggered by events that happen in the kernel, and those events could be Network packets, they could be a system call being made by user application – pretty much any sort of event in the kernel can be used to trigger an eBPF program.

Cilium was the first networking project to take advantage of eBPF. And it was always designed with the idea of container networking in mind. And the folks who started it are the founders of Isovalent, as well as being the originators of the Cilium project. So, Thomas Graf, Daniel Borkmann, who’s a kernel maintainer looking after eBPF, within the kernel.

And eBPF and Cilium, particularly eBPF in Networking and Cilium, kind of grew hand in hand since 2016 thereabouts, as we – the many, many contributors to the Cilium project – as it grew and as it gained functionality, sometimes that’s required additional capabilities in eBPF.

So, it’s been really almost like a long game. I think when Daniel and Thomas and Dan, the CEO, when they were first thinking about using eBPF, it was such a cutting-edge kernel technology – nobody was using it in production.

You know, when we add something to the kernel today, people won’t be using it in production for probably three, four, five years to come, so really, anticipating what the future was going to be.

I first saw Thomas presenting Cilium and the underlying eBPF technology back in 2017, and at the time I thought, “Well, this is revolutionary, this can change so many things.” Because not only can we see Network packets being manipulated by eBPF programs, we’ve also got this incredibly performant way of observing those Network packets and reporting on them that we can use for observability tooling. And like you mentioned network policy – we can implement network policy in eBPF.
Just making policy decisions about whether an individual Network packet is permitted or denied by policy, based on Kubernetes identities – this is the other real strength of Cilium.


It knows the Kubernetes identities, the labels of every pod. And so, you’re no longer just looking at network flows in terms of IP addresses and the port numbers you’re actually looking at them in terms of “this is a flow between service X and service Y.” It is so much more meaningful for a Kubernetes’ user.

Why the name Cilium

Mike: Just out of curiosity, do you know what Cilium means?

Liz: I think they’re little hairs in the inner ear – I’m not entirely sure why that was used as the name for the project.

Origin


Mike: I understand the eBPF technology is mind-blowing – Cilium is quite a project as I said. I mean, you’re not one of the co-founders, but do you know anything about how did it become actually a business?


Liz: I think pretty early on, as Cilium, the project, was getting established, and this sort of understanding that eBPF was going to be a really great foundation for efficient networking. That idea of building a company around this technology was probably in Thomas’s mind right from the get-go – I don’t know that for sure, but I imagine it was. And he and Dan Wendlandt, who I mentioned earlier – this is Thomas Graf and Dan Wendlandt – Dan had the background in software-defined networking, he’d worked at Nicira.


And I think they really saw the future of container networking being built on eBPF, so it was kind of natural to build a company. But, for the first few years, really the focus was on building the Cilium open-source projects, getting that really well-established and really well-known in the Kubernetes community.

It’s now been adopted by the CNCF, so we’ve actually contributed the project to CNCF, we’ve recently applied for graduation status there. It’s probably the most widely adopted in production networking plugin for Kubernetes now.

That kind of path from open-source projects, we really need to see this widely adopted, and then, a business that can provide, not just support, but also some Enterprise features that really large adopter is going to need. And just makes a lot of sense.

What does a Chief Open Source Officer do?


Mike: Your title is Chief Open Source Officer, and that’s a title I’ve never actually heard before. How is that role defined at Isovalent and why were you so excited to take on this mission?

Liz: It’s a particularly interesting title in a company where the vast majority of the engineering is open-source engineering, but I don’t run the engineering teams. My role is much more about how do we continue adoption of the open-source project, and how do we interface with the foundations, the community – I do a lot of work with the CNCF as well. How do we both act as good citizens towards that community and do the right thing in the open-source world. But also make sure that we’re taking advantage of everything we can.

You know, foundations like this offer us a lot of roots to speak to people who might become users and how we can do that in a way that is beneficial for people who want to learn about Cilium, or who want to learn about eBPF. So, that kind of educational role also falls within my team.

Open source v. Enterprise

Mike: This may sound like a silly question because Cilium was so powerful, but from a business perspective, what would you say are the main value propositions of the software?


Liz: So, from the open-source perspective, it’s a highly performant networking solution with built-in observability and security features. And we could dive into more details on what those are. From our perspective, it’s fantastic. If people are satisfied using the open-source version of the code – that’s great – we never want to make it such that — we don’t want to curtail the functionality, so that it always wants to be useful to open-source users.

That said, there are some features that particularly larger Enterprises are particularly interested in that you won’t need if you’re not a big Enterprise. So, for example, integrating with Legacy workloads. Some high availability features that you don’t really need unless you’re at a certain scale – those are the kind of features that we provide in the Enterprise distribution at Cilium.

Isovalent v. Sysdig?


Mike: Do you see yourselves competing with a company like Sysdig?

Liz: On the security front – yes. There is an element of competition there. I think we’re sort of speaking with slightly different customers there. Because, to my understanding, Sysdig is very much a security focused solution, whereas Cilium really applies more to a platform team who’s establishing, I would say Networking first, with this incredible set of security capabilities that you can then show to the security team, these amazing capabilities that they’ll get all that they already have by using Cilium.

I think we’re probably talking to different people within our respective customer organizations, but there is a certain amount of overlap around particularly the kind of runtime security, which we have a sub-project of Cilium called Cilium Tetragon. And there’s the ability to create profiles for the kind of things like accessing sensitive files or running certain executables, privilege escalation, suspicious network activity – these are the kind of things that we can detect at runtime using eBPF.

Why contribute project to the CNCF?

Mike: You mentioned that Cilium was contributed to the CNCF. What was the reason you brought the project to the CNCF? Also, what does that mean for the governance of the project?

Liz: It’s a big step to contribute a project. Because we hand over the intellectual property to the CNCF. That is something that Isovalent used to own and no longer owns. And the governance of the project really needs to be in the hands of the community. So, Isovalent remains the most prolific contributor, but – and this is again part of my role – encouraging more people and more organizations to get involved in not just code contributions and not just documentation contributions, but also the kind of broader evangelism of what Cilium is and the advantages of Cilium.

So, yeah, we’ve really embraced that community. And I think the phrase that we’ve used internally is “paved the world with Cilium”.

And the best way to pave the world with Cilium is to give it to as many people as possible, and the CNCF gives us a really great route to reaching all those people who are using Kubernetes. It gives those people confidence that it doesn’t matter what happens to Isovalent, the Cilium project is in the hands of a much, much bigger organization at this point.

And then, you know, that subset of people who are using Cilium, but then, find themselves needing Enterprise features. We won’t necessarily be the only Enterprise distribution, but there’s no doubt in my mind that we have the greatest expertise. So, hopefully, we will be the obvious choice for someone looking for Enterprise features or Enterprise support agreements around Cilium.

Trademark


Mike: This actually leads into my next question, which is that CNCF actually owns the trademark for Cilium, but your product, the Isovalent product is called Cilium Enterprise. And so, hypothetically, another company could make a product called Cilium Pro. I mean, I looked at the contributors and I went down eight contributors, they all work for Isovalent, I didn’t have time to go any further, but, obviously, your company has a lot of expertise, but still, the prospect that company spent a lot of money defending their trademarks, I almost never heard of anything like that – is it sort of terrifying, though?

Liz: I mean, at one level, yes, it is kind of terrifying. And Cilium is a brand name that is better recognized today than Isovalent is. And that’s a challenge that we have to embrace. And there are rules around what you can and can’t use – I think that there are probably still a few instances of documentation and use of the word Cilium, which we’re not really allowed to do any more, that we haven’t managed to tidy up everything.

There’s limitations on what you can and can’t use around a name based on what is now a Linux Foundation trademark. But everybody understands there’s a transition between us having a trademark and then giving it to the foundation. It obviously takes a little while to tidy up all that options around that, yeah. So, Isovalent Cilium Enterprise is the Isovalent distribution of what is a CNCF-owned community project.

Outside Contributors


Mike: I mentioned that there’s a lot of Isovalent engineers who are contributing code, but are there other engineers who are also contributing?

Liz: Absolutely! Google is quite a prolific contributor, Cilium is actually used in Google’s Dataplane V2, we have maintainers from Datadog, again a huge adopter who has been using it. Enormous scale – there’s some really good talks from Datadog talking about the scale of which they’ve deployed Cilium, we have contributors from Palantir.
Yeah, there’s several what we call committees, so maintainers of the project, who come from lots of different organizations. And then we have – I think it’s around 700 contributors in total. Isovalent today is just over a hundred people. The contributor base is much, much wider than just Isovalent. That said, we probably have the largest group of people working full-time at Cilium.

Market Segmentation?


Mike: On the commercial side, for infrastructure, the marketing is very horizontal, but have some natural segments worked out in terms of the customers who convert from open source to a commercial relationship with Isovalent? And are you figuring out that there’s any ways to segment the market here or the messaging?


Liz: I think that’s something we’re learning – I have just mentioned that we’re about a hundred people now, so we’re growing in our capabilities for how we target different customers and different verticals. We’ve had a lot of success in financial verticals media, quite a few transport, strangely enough. Yeah, so there’s a pretty wide breadth of Enterprises who have adopted this. I guess, the prerequisite for nearly all cases is that there are Cloud Native Kubernetes users, or that we do have some users who are using Cilium in a standalone load balancer scenario.

Have we figured out how to market to all of these different types of businesses? We’re absolutely still evolving and learning. But I think the fact that we’ve for many years had this very community-based focus, a very community-based approach, means that we can establish relationships and have trusted sharing expertise on a technical level that then encourages those engineering teams to recommend us internally.

And when it comes to making a choice about an Enterprise product or whether they need commercial support, those engineering teams already know who the experts are, and have potentially already had help from our team through the open-source community.

Team Location


Mike: Is there an Isovalent headquarters office where engineers go in, or is everyone like spread around the world?

Riz: We are fully distributed. We do have offices in Zurich, where Thomas is based, and in the Bay Area, where Dan is based. And I think that the timing, you know, really around the pandemic, just at the point as Isovalent was growing was sort of around the same time as the pandemic hit. So, inevitable that we were going to be remote based.

And as people have joined, they joined from countries all around the world. We have people from as far as long as Japan, or Alaska, Australia, throughout Europe and across the U.S. So, our team is really now fully distributed, and the culture has to embrace that. So, we’re very much focused on being remote first.

We do get the team together, and we try to get the whole company together, at least once a year. And we have a lot of encouragement around getting teams together in what we call hive time. Because we’re all about kind of bee-related metaphors.

Monetization: What features are enterprise?

Mike: I’m curious about monetization. It sounds like it’s open core, and what are the extra bits that you’re offering, I guess, in the Enterprise? And how do you decide what to make open source and what to add as an extra feature in the Enterprise distribution?

Riz: I see that the term open-core can sometimes come with a bit of a negative connotation. Sometimes people think of it as an open-source software that’s got some kind of, you know, been cut off at the knees, and that’s absolutely not what we believe in.

We absolutely believe in the open-source product being genuinely usable, and there are some pretty large organizations who continue to use just the open-source version. The kind of things that people will come to us for will be — there are some high availability features, there are things like BGP support for connecting into your legacy data center workloads, some Telco specific protocols that we’ve worked on – we very much don’t want people to feel that there’s something that’s core to their basic use case that they can’t do with Cilium.

Unless they are big enough that they’re the kind of organization that wants to pay anyway. You get to a certain size of organization, where you really don’t want to be just relying on open source with no sense of who’s going to support it when anything goes wrong. And they may come to us for features, they may come to us because they just want to know that somebody will be there to help them, you know, with a contract in place, should anything be needed.

Features for Growth


Mike: We mentioned that Cilium is a really broad product. Is there one particular product feature that you see driving the most growth, going forward in the next couple of years?

Liz: That’s a really great question, because we do have you know really, really powerful features in a number of different axes. So, for example, we just did a partnership with Griffon, where we’re building some really great dashboards, again a big part of this is available, completely open source.

There are also going to be some additional Enterprise features here. Perhaps the thing that strikes people is that they get this amazing visibility. And you know, that could be the moment when they realize, “Huh, look at the power of Cilium!” And the fact that we can see all these latency metrics or security information being shown in a visual way. So, that could be one thing that really drives growth.

It could be Service Mesh. We have a very efficient approach to doing sidecars Service Mesh in Kubernetes. Service mesh is one of those features that when it first started being talked about in probably 2018 – huge hype, huge excitement – the reality of people adopting Service Mesh, they found that it’s actually quite resource-heavy, there are issues, instrumenting all your workloads with these Service Mesh sidecars.

I think some of the realities of deploying Service Mesh had not quite lived up to the initial expectations. And then, last year, we announced the sidecarless approach that Cilium can bring. And mostly through the power of eBPF, it’s incredibly efficient. We can shortcut a lot of the path that a network packet has to take through the Service Mesh.

So, I think that’s another area that can be a real driver for growth, as people realize they can get all the benefits of Service Mesh, but without the overhead that they’ve come to associate with it.

And then, finally – security. I think I mentioned earlier the runtime security tooling that we’re able to provide through eBPF and through the Tetragon project, combining in a really performant, efficient security tooling. At the moment, everybody’s focus in security seems to be on supply chain, but they also still have firewalls. I’m quite a big believer that we have runtime security, everybody has runtime security in the form of firewalls.

We just were on the cusp of people understanding how powerful this new generation of runtime security tools can be to essentially firewall, not just Network packets, but things like bad executables or unexpected privilege escalations, that kind of thing.

Mike: Does the breadth of the product ever feel like a curse? Wouldn’t it be so much easier if there was just one application, and we can focus the marketing message and the sales, and all is just this one thing?

Liz: I’m sure the marketing team tasked with coming up with a tagline would find it a curse, yes.

Lessons for Open Source Startups?

Mike: So you’ve been in the techs business for a long time, taking off your Isovalent hat for a second and just as an observer of the startup scene, and other than the open-source scene in, do you have any advice for particularly entrepreneurs? Because this podcast is really designed first for founders, any advice for founders?

Liz: Yeah. This is actually something I’m getting increasingly interested in and I’m working with the CNCF on how we can encourage businesses on how to operate and be successful with open-source based businesses. There’s two sets of vendors who I would say have quite a lot to learn, particularly if they come into like a Cloud Native community audience.

There’s one class of vendor who is open-source based, they have an open-source project that they’re building their business around. The second class is people who are not open-source, but they have a product that they want to sell into the primarily open-source based Cloud Native community.

I think for both those sets of people, really understanding how powerful community is, Cloud Native community is kind of where I’ve lived for the last, I don’t know, half a dozen years. And it’s incredibly powerful, the relationships that you can build up – not just between individuals, between organizations, can be a really solid foundation for the business relationships that you then build on top of that.

And I think the real lesson for a lot of vendors is: don’t just expect to turn up at an event, pay for a booth or a table, and expect people to come and buy your software. Invest in time as well, invest in contributing, get involved in our project, get involved in the cigs and tags.

Don’t just expect people to immediately think that your open-source project is the one true amazing solution. Take the time to learn what other people are doing around that, and then, have those conversations about why your solution is great and what its strengths and potentially weaknesses might be. Learning to get involved in a community is really, really important.

Closing Notes


Mike: Well, I think that brings us to a close. Liz, thank you so much for sharing and best of luck with Isovalent and Cilium.

Liz: Thank you so much.

Mike: Again, special thank you to Amanda Brock and the whole open UK team for launching the State of Open Conference, where we recorded this episode. Cool graphics from Kamal Bhattacharjee, music from Broke For Free, Chris Zabriskie and Lee Rosevere.

Remember how Liz said that eBPF and Cilium are really good for Service Mesh? Well, remember that, because next week’s guest is Idit Levine the founder of Solo.io.

Until next time, this is Mike Schwartz, and thanks for listening to Open Source Underdogs.

Episode 62: Amandine Le Pape, Element CO-Founder / COO, Messaging and Collaboration

Almandine Le Pape is the Co-Founder and CEO of Element, the the company behind the Matrix protocol, which deines a “chat” and collaboration protocol that enables federation across Slack, Rocket.Chat, Element, and many other implementations.

Episode 55 – Miguel Valdés Faura, CEO and Co-Founder of Bonitasoft

Intro



Mike Schwartz: Hello and welcome to Open Source Underdogs. I’m your host, Mike Schwartz, and this is Episode 55, with Miguel Valdés Faura, CEO and Co-Founder of Bonitasoft.

Not every tech company follows the same trajectory to success. Hypergrowth is great if your market supports it, but the world of infrastructure software is diverse, and hypergrowth can subject your business to unreasonable risk.

To me, Bonitasoft was a reminder that a CEO’s responsibility can transcend shareholder value. While the primacy of shareholder value seems axiomatic in Silicon Valley, it’s worthwhile for entrepreneurs to weigh that risk. Miguel and his team did just that, and their success validates the idea that business models are not a one-size-fits-all proposition.

As a side note, as I was doing my research, I noticed that Miguel has interviews in Spanish, English and French. American CEOs are lucky to speak two languages, but three is pretty exceptional. Anyway, I hope you enjoy the interview. This was the last of 2020. So, without further ado, here we go.

Miguel, thank you so much for joining the podcast today.

BPM Market Overview

Miguel Valdés Faura: Thank you, Mike, for having me.

Mike Schwartz: So, although this is a business podcast, you’re a technical founder, and sometimes, it helps to have a high level of understanding of the market. Business Process Management, or BPM, it’s still an important way to think about how to apply technology, but the technology landscape has changed so much since 2001, I guess when you started the project, and even since 2011, when you started Bonitasoft. Why is BPM still a good way for companies to think about how to build applications?

Miguel Valdés Faura: Good question. So, it’s because companies – I like to say that it is all about processes, a ton of processes that are required to run a company, some that are more critical than others, but BPM technology has been here for a while to help companies, to rethink, re-invent and automate their processes, whatever, they are critical or not. Also, I think it is something that is here for a wider dimension, and of course, the market is evolving because also the needs of those processes are changing in organizations.

Project History

Mike Schwartz: So, the Bonita project itself started at the French National Institute for Research in Computer Science. The project was transferred to the Bull Group, and then, in 2009, you started BonitaSoft with Charles Souillard and Rodrigue Le Gall?

Miguel Valdés-Faura: Exactly.


Mike Schwartz: And also, over the years, how is the community grown? Is the Bull Group still involved, and are there other important contributors in the ecosystem?


Miguel Valdés Faura: BullGroup, which is now part of Atos, at the origin, is involved, but as a partner. It is one of those hundred employees, partners that we have – I’m talking about Consulting and System Integrators Partners that helps customers worldwide with the Bonita implementation, but nothing more, meaning that over the years, Bonita self has grown into an international community that goes beyond specific companies, but, also, having individuals working sometimes as freelance models, as part of the bigger companies.

And I think that’s one of the main achievements now. We have now a community of around 150,000 individuals working with Bonita, not all of them of course are contributing, it is only a small portion of this contributing code, but there is people participating in answering questions in the forum, or translating the products – there is a lot of activity in the Bonita community that is not relied only on one company.

Why No-Code Is No-Go?

Mike Schwartz: In an interview a few years back, you said that the no-code approach does not open the possibility for developers to write code that meets business needs. Can you expand on that? Don’t business people love drag-and-drop GUIs, to build BPM workflows?

Miguel Valdés-Faura: Yeah, a good one. So, probably, it was referring that with this new trend of local done, this new kind of developers, the thing some analysts were calling business developers, at some point, we were facing with people that are not skilled in development to build some complex applications, and at some point, they’re going to face some limitations. Of course, a lot of people like to build on applications, using drag-and-drop, as I mentioned, or visual tools, but when the application gets more complex, or when you need to customize a little bit more the application, at some point, developers need to be part of the game as well.

So, I’m not saying that it’s not useful to have business people participating in the development projects. I’m not saying that the local movement is not something that is real, I’m just saying that we need to find a balance between things that can be done graphically, and first that require code, and it’s about how those two different skill sets can collaborate, how business people or people without development skills, can also work on the same project with developers.

Probably, those two personas are not going to use the same thing.

Customer Profile

Mike Schwartz: Thousands of organizations use Bonitasoft, but switching to the business side a little bit, from a revenue perspective do you see the 80/20 rule, where 20% of your customers make up 80% of your revenues? And if so, what does that 20% segment look like, with regard to use cases or industry verticals?

Miguel Valdés-Faura: In terms of the verticals, of course, I think it’s not only something  -particularly Bonitasoft, all BPM vendors, you know, have a lot of traction in market that are highly competitive. So, for example, insurance, banking, telecommunications, because there is a lot of pressure to do better than the competition, because there is a lot of processes that are related about how you provide better services to your customers, and how are you going to retain those customers by providing good services.

So, those will be probably the main four sectors in which Bonitasoft is evolving and getting customers, and also, potentially the ones, in which other vendors are also evolving. In terms of the split or the size of the customers that we have, we have this idea from the very beginning to focus on medium and large organizations.

So, there are some BPM vendors that are focusing on smaller implementations, we are really focusing on complex implementations and meet large organizations. So, the majority of our customers, like 75% of our customers, will match that criteria. And the majority of the project implementation inside those projects are either core or critical to their business. We usually don’t start working with a customer in less critical business process, but this is part of our strategy. And, of course, our product is better suited for those complex implementation.

Value Proposition

Mike Schwartz: Kind of a basic question, but what would you say are the most important value propositions for your customers?


Miguel Valdés-Faura: First of all, we are selling a platform, not a product, so, what we want is like to bring together those two personas that I was referring in a previous question, so business people or less skilled people, in terms of technical skills, and how developers can work together. So, we have a platform, in which you have clearly separated the visual programming capabilities versus the coding capabilities. So, in a sense, we are taking the benefits of the majority of things that we see in an open-source project. So, extensibility, open architecture, which APIs, compatibility with other open-source technologies that are things that appeal to developers. And at the same time, we have an integrated platform, a unified platform, that is also providing visual capabilities to less technical people. And, also, this clear separation in which, depending on the skills that you have, you can use some of the capabilities of the platform, and depending of your skills, you can use some others – these are the things that make us different, and that people like about our solution.

Monetization

Mike Schwartz: Bonita project is open source, and Bonitasoft has a platform built around that – how exactly do you monetize?

Miguel Valdés Faura: So, we sell subscriptions – package additional capabilities to the open-source version, and also, some professional services. And those subscriptions, minimum is an annual subscription, are sold either for people that are deploying the Bonita platform on premise, or people that are using our cloud offering now. But, in two situations, we are basically adding capabilities on top of the open-source solution, like for example, monitoring capabilities and scalability. And we package that together with a professional support, SLAs, contractor warranties, as part of this subscription. Also, it’s a 100% of our probably related revenue is a recurring revenue.

Cloud Strategy

Mike Schwartz: Cloud hosting is really a great business model, and I heard you mention that you have a hosted offering. How has the hosted offering evolved over the years, and do you see that becoming sort of the most important way that you deliver the software? Would you say self-hosted is still going to be more important from a revenue standpoint?


Miguel Valdés Faura: Yeah, a good question. I think in our space, the BPM space, and particularly because of the nature of the projects that we target in our customers, as I was referring as core or critical, we still have a lot of people using the on-premise version, especially in banking insurance that are sectors that are still using a lot of on premise, or they are starting their cloud movement, using public clouds, but not really externalizing everything to SaaS solutions. So, on-premise is still really a big majority, but we have released our cloud service 18 months ago, and we already see a traction. So, there is more and more customers also embracing that new offering – I will say today is more like 80/20. We expect that this is going to change.

It took us a while to offer a Bonita Cloud version because we didn’t show a lot of demand previously. We, as I mentioned, we started seeing some companies that are more and more interested. We really believe that it’s going to be maximizing in the next years, but again, the on-premise is still the number one option today for our customers.

Prioritization Of R&D

Mike Schwartz: So, how do you prioritize your R&D effort, because you’re still contributing to the open-source project, but you are also building your commercial like extra features. And how do you prioritize R&D?

Miguel Valdés Faura: That’s a tricky one for every open-source company. Because you need to make also clear rules about what are the developments that are going to go open source versus the ones that are going to go commercial, and the same applies to the teams – do you have the same organization working on the two kind of features, do decide to have different organizations?

So, we have evolved over the years, but one thing hasn’t changed is that we have defined from the very beginning clear rules about what is open source and what is not. For example, we didn’t want our open-source version to be something that cannot be put into production, because that was not the essence for us, the essence for us as open source.

So, the open-source solution at Bonitasoft you can develop, and you can put it into production, however, for example, as soon as you’re talking about scaling – if you need to CCP, if you want to do clustering, those are the kind of things that, from the very beginning, have only been available in the commercial version.

Also, first of all, is about defining the rules, so, your development team knows what goes into one edition versus the other. Not only your development team, but also of course the community, the community using the open-source version and also your customers – it needs to be really clear. Secondly, over the years, we have evolved, also, in terms of how the development team is a structure, to be more focused on one product, one edition, meaning, one set of people for developers working, one part of the product that is either open source, or is commercial, which, of course, is a way simpler to manage from a management point of view.

Cloud Native Opportunity

Mike Schwartz: In the Cloud Native world, scaling is sort of table stakes, like Kubernetes out of the box is clustered, and my company Gluu, we’ve decided that we’re going to make scaling sort of part of the open-source, just because it seemed like it’s hard to get adoption in the Cloud Native world unless you support Kubernetes, and Kubernetes has clustering.

Do you see a similar trend in the BPM market? And are any challenges or opportunities around Kubernetes and the move to Cloud Native?


Miguel Valdés Faura: Even before Kubernetes, the move that we saw was the adoption of Docker. So, four years ago, we started to demand Docker super, as a way to use and deploy Bonita. So, that’s one of the first that we did. So, to certify a Docker image for people wanted to start their projects, it took us depending of the geography some time, we got that traction from the US, a little bit less in Europe in terms of adoption of the Docker image. Now, it’s a reality – there are more and more people using that. And, of course, those people are also asking now, “Okay, let’s combine that with Kubernetes.”

We have decided that Bonitasoft, that this is part of the kind of the capabilities that we can provide as part of our Cloud Edition. So, the elasticity capabilities that are offered to our Cloud customers is based on Kubernetes. And I think that the value to the customer is that we are able to manage that automatically for them.

This is something that we are at Bonitasoft proposing in our Cloud offering. But if someone wants to do it on premise, and they want integrate, the current Bonita on-premise version without the Kubernetes and manage Elasticity, they can do it.

But at Bonitasoft, we have a package to make it really simple for people who want to use the Cloud service.

Growth While Pivoting

Mike Schwartz: As you know, investors are super-focused on top-line growth. They want growth, growth, growth, but when there are major technology shifts, like from 2011 to today, seems like a different world. It’s hard enough to survive, let alone to grow a 100% per year. Can you talk about some of the challenges of achieving this high level of growth, especially if you have to pivot at the same time, like you probably did over the last couple of years?


Miguel Valdés Faura: A really good question. I mean, you know, it looks like hopefully things are changing, but when we started Bonitasoft off in 2009, and especially in the years that follows, looks like everyone needs to become a hyper-growth company. And of course, I really was trying to raise a lot of money, and we did it as well as Bonitasoft. And, of course, raising a lot of money means also at some point delivering really high growth. But things are changing, and I think that that’s okay, and that’s possible in some situations, it’s something you need to also be willing to do.

We wanted, at some point, growing the company that way at Bonitasoft, especially at the beginning, we decided to change. We decide to change because we wanted to build a more sustainable business, and of course, the level of research you take, if you are always following the hype- growth is a big risk. Because, of course, you are depending a lot of on money from investors, usually high-growth means high losses. So, you need to raise money. Of course, missing some of your targets can put the company at risk.

So, we decided five years ago to change, and embrace what we call a sustainable growth business model, in which profitability scheme for us, in which we try to grow as much as we can, if the company is profitable, and learn in environment in which people are enjoying their day-to-day work.

Now, we have to switch from one to the other, and I think that the pandemic that we are living these days is also reminding us that potentially that’s also a model that some other companies should consider..

Transition From Growth To Profitability

Mike Schwartz: That’s very interesting that you’re saying, “switch to high-growth as long as its profitable”, but how did you manage a relationship with your investors? Were they on board with that, or was there some friction around, saying, “We don’t want to accept this high level of risk?”


Miguel Valdés Faura: You mean, at the beginning, or when we decided to change to a more sustainable growth model?

Mike Schwartz: When you decided to change.

Miguel Valdés Faura: I think they were happy to see that after seven years of existence, we wanted to start looking to profitability. I think at some point that’s important for a company. And so, they were okay with that. And, then, of course we think that we have another kind of discussion with them because we are not asking any more money, the company is profitable for the last four years. So, then, do we need to deal with all the things like, okay, are we looking for an exit, are we looking to grow and do some acquisitions that we want to continue to grow the business organically – but, in any case, you are not forced to raise money which I think is good for us, and in some situations also good for investors.

Building The Sales Team

Mike Schwartz: So, it’s the technical founder one who’s been on the business side for a long time. Building a sales organization is really challenging – is there anything you’ve learned about building the sales team that you’d like to share with startup founders?


Miguel Valdés-Faura: Yes. It’s maybe because I’m also an engineer by training, but, of course, we did a lot of adjustments in the sales organizations over the years, and we’ve made a lot of mistakes and we’ve learned a couple of things. We made some great success, but you know, for the last four years, we are operating with sales methodology that probably you know this, it’s called Customer Centric Selling methodology, which is really focus on the value that you can bring to the customer, that is more focused on quality versus quantity, in which you do, not a lot of prospection, but you are really trying from a marketing perspective to have people really interested in having a discussion with you, and spending a little bit more time and trying to provide a solution that is, as I mentioned, to the problem.

So, then, you can surely acquire a new customer, but also make sure you can renew over the years. And this is one of the big things that we did. And we did it by having a mix in the sales team, people that are coming from different backgrounds, including engineering.

And I think that’s one of my first learning is that you can’t have people that have an engineering background that are doing exceptionally with that, and I think we’re seeing that with more and more companies.

Second, you need a methodology that is really focusing on providing value and delivering value to the customers. And this methodology needs to also be shared with marketing, and needs to be shared with the rest of the organization, including product teams know. And that has been a big change for us. Of course, we didn’t need that from one day to the other, but that move to this new methodology, having the right mix of people and focusing more on content and maturity of our leads than on quantity and prospection, has made a big difference for us.

Partner Strategy

Mike Schwartz: You mentioned that Atos was still a partner, and perhaps, there are other partners who are either bringing you business or you see as critical. But can you talk about like the role of like how the partner strategy has evolved over the years?


Miguel Valdés Faura: Today, we have three different kind of partners – we were talking about Atos, we have a category that we call Consultants and Systems Integrators Partners. As I mentioned, we have something like a hundred and plus of those partners, so, including CGI, including Atos, including Sopra, and then, other things that you in the U.S., you call it more boutique-like partners, or people that are more specialized in one particular sector. So, implementing projects in insurance or in banking. For example, in the U.S. people like Evoke, in Latin America people like Indra – this is one category. Those kind of partners are helping us either to identify new opportunities and also to do the implementation.

By the way, 62% of our new business is influenced by those consulting partners. Second category will be the technology partners. So, there’s no surprise here, this is about integration of our product with other products in a similar market. So, for example, we have those kind of partnerships with the UAiPath in the RPA space. We have this kind of partnership with a DocuSign. So, basically that means bi-directional integration between the two products. And I joined go-to-market, in which we think that the two products combined can bring more value to the customer.

And the third type of partners that we have are OEM Partners. So, it’s people or companies that are embedding our technology and reselling as part of their product. So, to name one that is more representative. Talend is doing that, Talend is that integration leader that is embedding Bonita as one of their offerings. So, those are the three kind of partners. And of course, this thing has been evolved, and has been over the years. So, we started with putting a lot of effort on Consulting and System Integrators Partners, and then we started to focus, in a second step, on more of the technology side of the story.

OEM Patnerships?

Mike Schwartz: You mentioned OEM partnership, which is interesting for open source, because I think that companies who want OEM can use the open source and become part of the ecosystem. What is the driver for a company to OEM in open-source product?

Miguel Valdés Faura: A good question. I think is that the nature of the technology that you are embedding, if you are embedding just Log4j for logging – that was, at least, used 15 years ago, – or Hibernate for persistence. Potentially, it’s the same done embedding, BPM engine or Workflow engine.

So, if you are embedding a solution, that is more like a project or a platform, that is in some way critical to the other solution that is embedding, potentially you’re going to look for, not only can I do it from a license perspective, but also, potentially, you are going to contact the other company to do a partnership. So, that’s what’s happening a lot in our ecosystem – embedding a VPN engine or embedding the whole platform, embedding a workflow solution is something that’s potentially going to be used for mission-critical things.

So, if that is the case, even if the license allows you to do it, potentially, you are going to also look for some help from the company that is building that. And of course, then, it could be also an issue with the license. You know, some of the licenses, for example, the GPL license are not allowed to embed directly without having an OEM equipment in place or changing the license. So, it could be either a license issue, or it could be that you need some helping if something goes wrong.

Licensing

Mike Schwartz: I normally don’t ask about license because I’ve actually been thinking about doing a whole another podcast, or maybe in a season or something, just on licensing, because it’s a complex topic.

Miguel Valdés Faura: Yeah.

Mike Schwartz: And, of course, Bonitasoft’s project’s been around for a long time, but is it GPL license – can you just talk for a second about the open-source license that you’re using, and maybe why?

Miguel Valdés Faura: The open-source project is really under the GPL license, and it’s more historical reasons, this is how we started the project. You know, at that time, it was the time when MySQL was – those kind of projects were appearing, it was the time of Enterprise middleware – so, we kept that license because that was also all discussions around open-core business model. And we didn’t change the scene that, for example, we are now also launching new ones, new products in which, we are also moving to some other license like Apache or MIT. But we kept, for the Bonita project, the GPL license because this is the one that get everything started.

Mike Schwartz: It sounds like the less permissive license actually has benefited you. But I think there’s sort of a knee-jerk reaction or policy among entrepreneurs these days to use permissive license, like MIT or patchy, but it sounds like GPL actually helped you in this case.

Miguel Valdés-Faura: Yeah, it kind of helped, for example, we’re talking about the OEM, it can help the OEM space, some of the people are going to see that there are some restrictions, and then, of course, there is this debate about, okay, but if I’m burying a GPL library, it’s going to be contaminating my project, but usually when you have that issue is because the project that you are building is usually something that you want to follow the open source, you want to commercialize something, just by liberating all those people work in open source, so, yeah, as you mentioned it, it’s always a complex discussion.

But, yeah, I think there are some benefits of using GPL, there are potentially some drawbacks depending of what do you want to build with that license – I think it depends. So, it is not magical rollover for what is the best license to use in your next project.

Keys To Growth In 2021

Mike Schwartz: So, there’s a lot going on today. We have the pandemic, moved to Cloud Native, changes in paradigms, like continuous delivery. What do you think are the keys to growth in the next few years?

Miguel Valdés Faura: I think nobody knows. I think we need to be humble, especially with everything and all those things that are going on, that are going on these days. But you know what, I will be back to my – what I was talking about the sustainable growth. I think that more than ever, being in a business, running a business in which you know that you are profitable, that you are of course trying to maximize, and you are ambitious to maximize the growth, if you are still profitable, having a strong customer base that this renewing year after year is what makes a big difference, especially when there are some situations that they want to do, are facing now.

Because, of course, if you don’t have that, and for some reason, you just stop signing new customers, or signing the new customer database that you were signing before. If you have a strong customer base, you’re going to suffer more than others. So, I will be back to that concept of sustainable growth because I think it’s what makes the company less risky, more sustainable in the long run.

Advice For Entrepreneurs

Mike Schwartz: You know, startups are roller-coasters. I personally don’t recommend starting a company, especially a tech company, to anyone who’d asked me, but for those people crazy enough to dive into entrepreneurship – do you have any advice for new entrepreneurs who are launching a business around open-source product?

Miguel Valdés Faura: I will have one. It’s obvious that I think it is good that we remember that from time to time, which is, there are no two companies that are alike, so, the same applies to founders. Don’t pretend to be somebody else. Of course, listen and learn from others in your ecosystem, but be yourself. And if you create a company, as you mentioned, if you are crazy enough to create the company, try to be surrounded by people that share the culture that you have in mind, the strategy that you have in mind. Don’t pretend to be a CEO that you are not. And that’s – I go back to – not all the companies need to be the same, not all the companies need to be unicorn, not all the companies need to follow the same business model, but you need to be really comfortable about the choices that you make, otherwise, it is going to be even harder than you know it simple journey.

Closing

Mike Schwartz: It’s 55 podcast. I always ask that question at the end – no one’s actually given that answer yet, but I have to say I agree with that a100%. So, thank you for being the 55th guest, and best of luck this year. And thank you so much for being on the podcast, Miguel.

Miguel Valdés Faura: Thank you very much, Mike, for inviting me. It was a real pleasure.

Mike Schwartz: Special thanks to the Bonitsoft team for helping us to schedule the interview. Editing by Ines Cetenji. Transcription by Marina Andjelkovic. Cool graphics from Kemal Bhattacharjee. Music from Broke For Free, Chris Zabriskie and Lee Rosevere.

This is the last episode of 2020. Next year, I’ll keep going, although probably at a somewhat slower rate. If you have any ideas for the direction the podcast should go in 2021, I’d love to hear your feedback. You can contact me on the website opensourceunderdogs.com. Happy holidays, founders! Hang in there, and keep an eye out for new Season 4 episodes after the New Year.


Episode 54: Justin Borgman, CEO of Starburst, the Company Behind the Presto Project

Intro

Mike: Hello, and welcome to Open Source Underdogs. I’m your host Mike Schwartz, and this is the episode 54, with Justin Borgman, Chairman, CEO, and Co-Founder of Starburst, the company behind the Presto Data Access Project.

Before we get started, I have a quick request – we all want to help open-source founders and startups. I make the podcast, but I need your help to get the word out, so tell your friends, post on LinkedIn, tweet out a link, post on Hacker News, or follow me and share one of my posts on LinkedIn, whatever you think makes sense, go for it.

One of the themes of Machiavelli’s the Prince is Virtu e Fortunavirtu meaning excellence in your domain, and fortuna meaning luck, whether good or bad. I really like how the story of Starburst exemplifies this 500-year-old insight.


Justin has a ton of domain virtu. He has deep technical knowledge, but he’s also on the lookout to harness fortuna. He’s one of the few podcast guests to acknowledge it. And Starburst earns its name because it’s one of the most stellar open-source business success stories I’ve heard in the last few years.
There’s so many great insights in this episode, a lot to think about. So, without further ado, let’s get on with the interview.

What Is Presto?

Mike: Justin, thanks for joining the podcast today.

Justin: Hey, Mike, super glad to be with you.

Mike: Before we dive into the business stuff, I find it’s helpful to talk a little bit about the technology. Can you start by giving a brief history of the Presto project? What it’s good at, and how the community coalesced around it?

Justin: It was really back in 2012 for developers at Facebook, Martin, Dain, David, and Eric came together to create a new infrastructure project that would be a faster way of querying data at Facebook. Facebook, of course, collects massive amounts of data, hundreds of petabytes worth of data , and needed a faster alternative to a prior project that they also developed and they called Hive.

Hive was a SQL engine for Hadoop, and it just wasn’t fast enough. So, Presto was created to be a faster means of accessing that data. But it has one really important differentiation in addition to the speed, which is the ability to access data anywhere. So, it’s like a database without storage – that’s kind of one way to think about it.

So, it looks at storage in other systems, which could be Hadoop, it could be S3 and AWS, it could be a traditional database, like Oracle, or Teradata, or Snowflake. And regardless of where that data lives, Presto can reach it, query it, and deliver SQL-based analytics.

So, that’s kind of what makes it special, is the ability to access the data everywhere. And that’s gained particular momentum, I would say more recently, as many large enterprises have data silo problems, where they have data in a bunch of different databases, and are now perhaps moving to the Cloud in some fashion.

Mike: And if I’m not mistaken, high concurrency is one of the areas that make sort of this data access plain different?

Justin: Yes, exactly, it’s very fast, and can support high concurrency. And in a lot of ways, this technology was sort of, I like to say built in reverse, in the sense that it was tested at ridiculous scale from day one. You know, very often, when you start something new, you don’t really know how it’ll work at scale until you get people using it. But because it was really born out of the internet companies, Facebook, and Uber, Airbnb, and Netflix were all early adopters to use the technology, it was really tested, and at scale, and as a result delivers great performance and concurrency.

Origin Story

Mike: Starburst is not your first company, you are part of a team at the company called Hadapt that’s sold to Teradata in about three and a half years, I think.

Justin: Yep.

Mike: How did that experience lead you to Presto?

Justin: In a lot of ways, this is really a continuation of that journey that began 10 years ago. So, that was 2010 that I started Hadapt. Hadapt was a spin-out actually from Yale University and the computer science department – there’s some research called HadoopDB, which was pretty pioneering research at the time, in terms of thinking about Hadoop as a data warehousing solution, and being able to deliver fast SQL analytics on top of Hadoop.

So, we spun that out, raised Venture Capital, built that business over nearly four years, as you mentioned, and then sold it to Teradata. We had ups and downs, definitely lessons learned through that experience. And I think, really, my discovery of Presto after arriving at Teradata in 2014 was kind of an exciting opportunity to reimagine the strategy that we had with Hadapt.

So, Hadapt was the SQL engine for Hadoop, Presto is a SQL engine for anything essentially, allows you to access data anywhere.it was an opportunity to basically take all the lessons learned from the first experience and start to apply them over again.

It was actually my team from Hadapt that ended up contributing a tremendous amount of software to Presto, and working with the guys at Facebook, who created it to really make it an enterprise-grade piece of technology. And I think, as we started to see Presto get more and more capable, and see more and more people use it, that was what created the idea in our head that maybe there was a business to be formed around this.

Community Engagement

Mike: It’s a really interesting opportunity, and I can’t actually think of another example like it, but when I’m talking about open source, I sometimes talk about three types of open-source companies. One would be volunteer, where a bunch of guys or girls get together and write some piece of software that they love, but not necessarily for a business.

And then, I talk about corporate open source, where there’s some piece of software, where a company funds it, but it’s not their core business, but then, they realize that makes sense for them to collaborate like Kubernetes, let’s say ,and Google, and these pure-play, open-source companies, where the company behind it is developing it, and they’re the main contributors.


And so, lots of great open-source projects come out of this corporate open-source area, the podcast that is mostly focused on pure-play because they were trying to help entrepreneurs and founders start open source, use open source as part of their business model. But you’ve sort of, like, created a very interesting situation, where you have a mix of corporate and pure-play because you’re benefiting from, not just the community, but, really, Facebook is a big contributor to the project to — I heard almost 50/50. So, how’s that really evolved, and how do you continue to encourage this very symbiotic relationship?


Justin: You’re right. Preston has a very interesting history to it, an interesting journey. It started as a small project at Facebook. When we got involved at Teradata, we were able to apply a few million dollars a year of R&D budget into advancing that as well. And then, of course, you’ve got a few other companies contributing also along the way.

And, as a result, all of that kind of accelerates the development of the project. And I think that maybe what’s most unique here is not only that Facebook created great infrastructure software as a byproduct of their business – they’ve certainly done that before – but rather that there was kind of a commercial partner very early on, and myself, and my team at Teradata thinking about the commercial applications of this.

So, you know, back in 2014, Presto was still in its early days, Facebook wasn’t trying to monetize it obviously, that’s not their business, but we were already thinking about how this could be used by Fortune 500 customers, and what difference this could make to their business. And I think that led to its very enterprise-applicable evolution, and set us up really well to eventually commercialize this in 2017, when we left Teradata, the creators of Presto joined us from Facebook. And we went off on our way to build this business.

Idea Incubation

Mike: So, you were working on Presto while you’re at Teradata. And did Teradata ask for any equity, or how did that work when you told Teradata, “We want to start this company basically working at Teradata? Like, what was that like?

Justin: Yeah, well, what was interesting about that – and I guess just to set the context, I think Teradata, from 2014, when they acquired my company through to probably today, has gone through various iterations of kind of rethinking their overall strategy, in terms of how they evolved into this next generation of sort of Big Data platforms. Because they had great success in the ‘80s, ‘90s, and early 2000s, as this kind of monolithic data warehouse, where you would ingest everything and store it in one place.

But obviously that became very expensive over time. And the appliance model, hardware and software combined, wasn’t necessarily set up for this future as people move to the cloud. So, they’ve gone through a lot of iterations. And it was really in that iterative process, where they weren’t really clear where they wanted to go, that they actually felt like Presto is maybe a distraction for them.

So, that actually created the opportunity, I think, for us, to say, well, we think it’s a little more than a distraction. And, you know, we’d be happy to sort of take that off your hands and work on this together.

So, it was a very amicable split – we remain partners, we’re still partners today, where we work together on some customer accounts, the technologies work together, we can access data in Teradata, for example, from Presto. So, that partnership remains. But it was one that I think for them, they viewed us as sort of taking Presto off their hands because there were maybe close to a dozen companies within their customer base that were using Presto. So, we were able to deliver really first-class support to those customers, you know, not provide any interruption there, even as we left and formed this new business. So, they don’t own equity, it’s purely a partnership.

Identifying Opportunity

Mike: It’s just amazing like how you deal your business, is you got a huge company Facebook to help you grade and test this infrastructure. You got to do R&D in Teradata, and then you started the business with customers – it doesn’t get any better than that really.

Justin: Now, you’re absolutely right. And believe me, the good fortune is certainly not lost on me. You know, advice I give to entrepreneurs of any type, not just open-source entrepreneurs, is to just have your eyes open to opportunity. I think it passes us all by all the time, and very often we miss it. And I think seeing it, and then, you know, running and jumping on it, it certainly has been beneficial in my career. I’m even going back to my first company and spinning out technology from Yale, which you could argue was the great benefit of various government research grants, funding that research in the first place. So, keeping the eyes open and seeing an application for where it could become a business.

When To Raise Money

Mike: So, initially, you didn’t have to raise money because you had some customers that came that provided some runway, but you did raise a series A, and I guess, October 2018, so, pretty recently. So, what was in the decision process to say, “Okay, now capital is going to help us.”, like what were some of the benchmarks that you reached, that helped you say, “Now is the time we should do that.”?

Justin: So, that’s exactly right. We started without raising any capital. That allowed us to build a profitable cash-flow positive business over those first two years of operating, which I think, by the way, as an aside, gave us a lot of opportunity to be patient and sort of think through exactly what we wanted our go-to-market strategy to be, what kind of strategy we wanted to take around monetization.

And we didn’t have the pressure of investors necessarily breathing down our neck, which I think many, many entrepreneurs have in those early days. So, I think it was a great way to start a business, what forced us to change and actually consider taking capital was really a realization that the market opportunity was bigger than we felt like we could actually satisfy growing at purely an organic rate.

So, we took that series A really as a growth round, you know, even though it’s called the series A, I think it’s a little bit misleading, because it’s probably more like a series B for most companies in that. Not only was it a large amount of money 22 million in that first round, but it was really deployed towards expansion and rapidly growing the business. Less so about proving product/market fit, which is more typical in a series A.

As you said, we did a series B shortly thereafter, which was probably more like a series C, adding another 42 million. So, we’ve gone from raising nothing to now 64 million. And really I think that was all made possible by really building the fundamentals first. Making sure you have that product/market fit sorted out, and then, you know, applying fuel to the fire to expand.

Revenues Pre-Investment

Mike: What was the revenues when you raised the series A?

Justin: Yeah, well, if it was 12 months looking forward, I would say it was already looking north of $10M at that point. So, that allowed us to really take the funding and apply it to, again, expansion rather than kind of sorting out the basic product details.

Mike: And what year did you actually start the company?

Justin: 2017.

Mike: That’s pretty amazing – two years to go to $10M. It’s pretty stellar.

Justin: Thank you. I mean, again, I think a big advantage here was that, in some ways, this was like building the same company over again – I mean, there are a lot of differences between this and my first, but they’re also enough similarities, just in terms of the types of customers that we sell to, the types of use cases, the types of problems that they’re trying to solve. So, I think that historical knowledge was advantageous for us to just move a little bit faster this time around than we did that the first time.

Balancing R&D Investment

Mike: Okay, switching gears a little bit into more basic business stuff. You mentioned in one of your previous interviews that I listened to, that Starburst is basically pursuing an open-core strategy. So, performance, robustness, security patches that goes into open source, things like connectors, security, ease of use, I guess GUI deployment stuff, goes into the core. One of the questions that I’ve sort of wonder about is, how do you decide how to prioritize R&D in open source versus the enterprise features when you go the open-core route?

Justin: Yeah, I mean, I think that’s the key question. So, it makes sense why you’re asking it, and I think it has to be on the mind of every open-source entrepreneur. And it’s a delicate balance because, on the one hand, you want to make the open-source project as useful as possible to get widespread adoption. Because really that’s your lead generation vehicle – I think that’s the way to think about it.

A lot of people say open source is really just another form of a freemium business model. There’s a free component, and that just happens to be open source in an open-source model. And then, how do you kind of upsell to the Enterprise version. So, for us, I think the logic was, what are the reasons why people use Presto anyways in the first place.

And we think performance is a core element to that. So, we wanted to make sure that performance is always great, right out of the box, with the first experience of it, including the open-source version. So, that’s why a ton of work goes into open-source around performance enhancement, scalability enhancements, those kinds of things.

And then, we think about, well, what do people in enterprises, who are willing to pay for this stuff, what do they want. And that’s where it is, things like security features, which are just essential for any large, mature enterprise things, like role-based access control, data masking, if you’ve got social security numbers or credit card numbers, being able to mask digits appropriately, having audit logs for querying.

And then, because Presto access is all these different types of data sources, it also made logical sense that if you’re going to access a database like Oracle, or Teradata, or IBM, all of which are very expensive in their own right, well then, a customer, probably, is willing to pay for enhanced connectors to get faster throughput to those systems.

So, that was kind of the logic was trying to like think through what are the enterprise features that someone is willing to pay a premium for, versus what just produces an out-of-the-box great experience. Because I think so much about open source is really people doing their own self-evaluations of the technology. So, self POCs, if you will, so, you want to make sure that’s great, because you can’t control that. You may not even know who downloaded it in the first place. So, that’s where you really want to put I think a lot of energy into the open-source project. And then, it’s as more of those production features that are important to the larger enterprises, where those I think you can hold back.

Why Not 100% Open Source?

Mike: I interviewed Mike Olson from Cloudera, you might know him.

Justin: I do, oh, yeah.

Mike: He was one of my first guest, and he gave a very similar comment to what you were just saying. And he was quite emphatic about it. And yet, Cloudera recently switched to a 100% open-source strategy. And other open-source companies have also, for example, Chef, and of course some of the older, Linux distributions are, RedHat and SUSE are all open source.

And so, one of the things I’ve been wondering myself is, you can use the open-core strategy. It makes perfect sense I think to business people, but I also wonder, this license is paying for the right to use the software. Do you think that customers are actually paying for the right to use, or they’re paying for the engagement with your organization? And do you think, if you made it all open source, it would actually negatively affect your revenues, or customers would still want to engage with Starburst a company?


Justin: I think I can speak from experience here, because part of what’s interesting about our history is that we’ve kind of evolved through the various open-source business models in our brief history. So, when we first started the company, we didn’t have any proprietary IP, so we naturally just sold support contract. So, the early customers that we started with were just support contracts.

I think the challenge that we quickly identified is that support alone is not the most compelling value proposition. It is to some, I’m not saying it’s not, but it’s not a sufficiently compelling I think to win over a broad set of customers.

I think that’s where the open-core model, at least for us, really created an inflection in the business, where, you know, now we had a real tangible reason. And, by the way, for what it’s worth, I think we learn this actually from our own prospects, that those who are actually huge fans of Presto, who are huge fans of us even, who were champions of what we’re doing, but couldn’t quite get the purchase across the line in those early days and that first year of our operation, because they couldn’t justify or explain to their boss why one would have to pay for something that was free essentially. And that was the tricky conversation was like, “Well, you get this for free, why would you pay for it?” Like, “We don’t need support, you guys are smart, you can support this, right?” And those are the kinds of conversations that can take place. So, I think that’s where the open-core model is really helpful to the business.

Monetization Strategy

Mike: You’re selling a product that’s almost like a data access product, like I call the Presto Interface, and it connects two back-end databases. How do you price an interface, like what are the buckets – I don’t need to know the price but I’m just wondering like, how do you land and expand, and how do you set up the model, so that it’s easy enough for customers to understand, and you can charge enterprise software rates for it?

Justin: The way that we monetize this is based on CPU consumption. Technically, we actually anchor on Virtual CPU consumption because so many of our customers deploy in Cloud environments. So, that’s the underlying metric, and the reason that’s a good proxy for us is because basically Presto is a technology that scales out super effectively, and is leveraging compute-intensively to execute the query.

So, it’s basically, like, the more queries you have, the more data you’re accessing, the more complexity of the workload, and the more users who are hitting the system you talked about, the strong concurrency that Presto provides. Those are kind of the dimensions that drive CPU consumption up, and we just monetize with that. It’s a straightforward metric I think that customers easily understand, and seems to work for us.

Optionality

Mike: In one of your previous talks I listened to, you talked about optionality, and how you recommended basically that optionality essentially drives freedom – how does Presto help you get that optionality?

Justin: Presto creates optionality by virtue of being disconnected from storage, is essentially not having its own storage layer. I used the analogy in the beginning that we’re like a database without storage. The other way I put it for people who are familiar with data warehousing is, we provide data warehousing analytics without the data warehouse. That’s another way to think about it.

So, because of that, it basically allows you to think about Presto as an abstraction layer, above all the data sources that you already have. And you can kind of skip the complex and time-consuming task of having to move data around, create copies of data, ETL it, extract it, transform it, and load it into another system, instead you can just do that at query time, and access that data, and get your results.

So, that gives you a lot of flexibility, and I think one of the ways we’ve seen that play out is, we have a lot of customers that have a classic data warehouse, maybe it’s Teradata or Oracle. And then, they’ve got some kind of a data-lake strategy, and maybe that’s either Hadoop on-prem, or maybe it’s S3, or some Cloud-object storage.

And the first step might be to use Presto to just join tables between these two systems. You’ve got some kind of user behavior logs in your data lake, and you’ve got billing data in your classic data warehouse, and you want to be able to correlate the behavior with the billing, let’s say. That would be a very common use case for us. You can do that with a simple query and Presto.

Now, what that allows you to do then, as a second step, is, essentially, hide from your own end-users, be them internal analyst, data scientist, or even customers. Where the data actually lives, they don’t need to know that they need to go to the data warehouse to get the billing data, and they need to go to the data lake to get the user behavior – they’re just submitting a query, and they don’t know where the data lives anymore.

And by doing that, you’re able to actually decouple your end-user from where the data is stored and give the architects in the organization the ability to now decide, based on cost or performance, where that data should actually live. So, you don’t need to pay Oracle or Teradata tremendous amounts of money to store your data anymore. That is, of course, the most expensive storage you’re going to find.

You could instead choose object storage, like Ceph from RedHat, or there’s a company on the West Coast called MinIO, which creates S3-compatible object storage. And that’s very inexpensive, relatively speaking. And you can deploy all of your data, or start to migrate your data into this lower-cost storage, and still be able to access it, while your end-users are none the wiser to where the data lives – they’re just getting their results. So, I think that’s where you kind of get to create this optionality and be flexible about where you put your data over time.

Mike: In addition to the technical level, I always think about optionality as, does the open-source license itself also lead, or open-source infrastructure in general, also lead to more optionality and freedom?

Justin: For sure. I mean, I think the notion of not having vendor lock-in is really important to customers. Increasingly so, I think they’ve been burned over decades of very expensive technology that becomes legacy technology, and then, their stock and the pricing goes up. And they don’t feel like they have much ability to resolve that. And I think the open-source license in and of itself gives customers a lot of comfort, in knowing that, you know, a worst-case scenario, they can always roll this themselves, with the open source. But also, Presto is able to read open data formats, which is also great. Because I think data lock-in is probably the worst kind of vendor lock-in.

And in a traditional database system, once the data is loaded into the database, it’s kind of not easy to get access to or get the data out, without continuing to pay for that database system. But if you’re using open data formats, which we’d really pioneered during the Hadoop era, these are like ORC or Parquet, if you’re familiar with those file formats, you can store them anywhere and query them with a multitude of tools. You could use Spark to train machine-learning models, working off the same Parquet files that you’re querying via SQL for Presto. And I think that gives customers a lot of flexibility as well.

Open Source V. Commercial Market Size

Mike: I read a lot of articles about how enterprises are really moving towards open source, certainly when you look at the large consumer-facing services, like you mentioned, Netflix, Facebook, etc., they’re building a lot on open source. Then, you look at the size of the market, and you see that, actually, from a market percentage of open-source software is still only a tiny amount – is the move to open source really real, or is it more hype than reality?

Justin: When you say the market is small, do you mean measured in dollars, or what’s the metric there?

Mike: Dollar, yes.

Justin: Yep, makes sense. And that’s the key piece. I think it’s probably super widely used, but the percentage of open source that actually gets monetized is relatively small. And I think that’s what’s translating to the overall dollar amount, seeming small, relative, to the proprietary solution. I think if you measured in terms of impact to businesses and organizations, I think it’s actually probably the reverse actually, where you might have more open-source software having bigger impact than the proprietary.

But, of course, the challenge – and I suppose this is the purpose of your podcast – is figuring out how to monetize that effectively, so that you can build a successful business, while having that broad impact that open source provides. And I do think that, as vendors, we’ve gotten smarter over the years about how to do that.

I mean, the way I think about open-source business models over history is that it started with the sort of pure-play support model, just offering support, nothing proprietary. I think kind of Generation 2 was the open-core model that we’ve spent time talking about. You know, Cloudera popularized that, as did many other companies. And I think Generation 3, which is actually where we’re moving as well as a company, is cloud-hosted, SaaS offerings.

And, basically, being able to make part of the value proposition, the simplicity of the solution that you can deliver as a SaaS, and I think data bricks is a great example of that. So, I think that’s kind of the next frontier. And I think, as more and more open-source companies move in that direction, I think they’ll probably have better success in monetizing that background usage of the open-source. Because, there’s so much you can control now from a SaaS perspective to really enhance the experience, that is just easier for customers to use your SaaS solution, rather than having to maintain it themselves.

Starburst Cloud Strategy

Mike: I normally ask companies if they’re developing a SaaS offering. And I think that there are some companies where it’s been really successful like MongoDB, Eli Horowitz from MongoDB is emphatic that cloud is the best business model and everyone should be doing cloud. In doing the 50+ podcast, I found that the results have been mixed, where sometimes companies find that it’s a good way to reduce the try by fly time, where the cloud offering is a good introduction, but then the revenues are mostly derived from the enterprise, like self-hosted version.

And it takes a lot of effort to actually — it’s almost like a whole new product, like you’re building a software platform, a great software platform, and then, building the SaaS is almost like a totally new product in different business endeavor. What’s Presto done in this area? Are you working on it? And do you have any thoughts about how that experience is going, sort of making a cloud offering out of the software?

Justin: We definitely are working on it, and we have been actually for quite some time. And it is hard work. I think there’s no doubt about it, but I do think that some recent innovations around Kubernetes actually make this easier than it maybe was a few years ago. Because Kubernetes can kind of create a uniform, almost like operating system, if you will, that you can deploy your software within, and therefore, sort of create the software once, rather than having to have all these different kind of custom versions for different types of deployments.

I think that’s a game-changer. It’s certainly something we’re betting heavily on, as we approached that by trying to create the same experience, regardless of where customers deploy.

Single-Tenant V. Multi-Tenant SaaS

Mike: Most of the old cloud services were multi-tenant, but, are you thinking, like with Kubernetes, we could maybe build a single-tenant and deliver sort of like, “We’ll host it for you, you’ll host it.”, but it’s going to be sort of the same thing?

Justin: That’s exactly right, yeah. You know, I don’t want to give away too much of our strategy just yet because we haven’t released the cloud product yet. But I think those are really important concepts that you highlighted there, that we’re very interested in.

Building A Sales Team

Mike: So, something you must have done a really good job at is building the sales organization, because $10M in sales hasn’t happened by accident. And I think sometimes founders underestimate how difficult it is to build a sales and marketing organization – did you have any thoughts or advice you could share on, like how that went for you, like, how you pulled it off, like how do you do it?

Justin: Yeah. I think the first step I would say is trying to understand yourself as the entrepreneur – what the sales process looks like, like, what are customers buying, how do they understand the value proposition. And I’m a big believer in entrepreneurs selling the first few customers themselves. I think you learn so much, even from a product management perspective of what you need. You get to experience what your sales reps will experience when you start to scale up. So, I’m a huge advocate of that.

The second thing I would say is find a great sales leader. Because you know there are folks out there who have done this many times before, and know what it takes to sort of scale up a sales organization. And, certainly, that was impactful for us in finding our VP of sales, who’s done a great job of really scaling up that organization quickly.

Team

Mike: One question I had was, the pandemic has changed things were much more remote –  were you remote before the pandemic, and what’s your plan for growing the team in the next couple of years?

Justin: We were not entirely remote, but we did have some level of distributed nature to our team. Before the pandemic, we had major teams in Boston, the Bay Area, and then, actually Warsaw Poland as well, as an important development center for us. So, we kind of had to work across these three geographies, which are obviously spread out by 9 hours of time zones. And I think that gave us maybe a head start on the pandemic. But to be perfectly frank, I mean, I would much rather go back to actually having an office, and being able to interact on a one-on-one basis personally, with so many of these people.

Because I think what’s been weird for us is, we have scaled so quickly this year that I have not met probably half of our employees at this point, which is just a weird thing, to have grown the company so much. And the only interactions I’ve had have been over a Zoom call. So, that part I miss. I do think we’re all trying to make the best out of it, of course. And I think good best practices are sort of documenting everything, having frequent all-hands meetings, where you get everybody together, but there’s still no real substitute I think for one-on-one interaction.

Founder Advice

Mike: The last question, any advice for new entrepreneurs who are launching a business, and they want to use open-source software development as part of their business strategy?

Justin: My advice would be to think early about that key question that you asked earlier in the podcast about what your monetization strategy is going to be, and on along what metrics are you going to, or what criteria I should say, are you going to be separating the enterprise value proposition from what you give for free, and I think kind of have a strategy early on and stick to it. Because I think that will just make the decision-making process so much easier for you as you go along. You won’t have to debate each and every feature that you come up with – you’ll just sort of know because it will fall into a framework. That would be my piece of advice.

Close

Mike: Justin, thank you so much for sharing all this knowledge and experience with us.

Justin: Thank you, Mike. This was fun, and it was great meeting you.

Mike: Thanks to the Starburst team for reaching out and coordinating the podcast. Audio editing by Ines Cetenji, transcription and episode website by Marina Andjelkovic. Cool graphics by Kemal Bhattacharjee. Music from Broke For Free, Chris Zabriskie and Lee Rosevere.

Next time, we’re joined by Miguel Valdes Faura, CEO and Co-Founder of Bonitasoft, a global provider of BPM, low-code, and digital transformation solutions.

Until next time, stay safe, and thanks for listening.

Episode 51: Cloud Native Agility, Reliability and Stability with Weaveworks CTO Cornelia Davis

Interview with Cornelia Davis, CTO of Weaveworks, a leader in the cloud native infrastructure open source software ecosystem.

Episode 50: DataStax NoSQL solutions built on Apache Cassandra with Kathryn Erickson, Open Source and Ecosystem Strategy

Intro


Mike Schwartz: Hello and welcome to Open Source Underdogs. I’m your host, Mike Schwartz, and this is episode 50 with Kathryn Erickson who helps lead open-source strategy at DataStax. Founded in 2010 and currently employing about 500 people, DataStax was one of the first and most successful companies in the Apache Cassandra big data Ecosystem.


Kathryn has an engineering background. You can listen to some of her great deep dives into the tech on the DataStax website. In her role on the strategy team, she’s helping to lead the company into its next phase of growth and community engagement. I hope you’ll enjoy this episode. And if you do, don’t forget to share a link on social media. You can find all the episodes on opensourceunderdogs.com, or you can retweet our announcement by following us on Twitter. Our handle is @fosspodcast. So, without further ado, let’s carry on with the interview.

DataStax Origin

Mike Schwartz: Kathryn, thank you for joining us today.

Kathryn Erickson: Sure, of course, thank you.

Mike Schwartz: Most of our listeners probably know about Apache Cassandra, one of the most popular databases for big data, but how did DataStax evolved in relation to the Cassandra project.

Kathryn Erickson: DataStax was founded by Jonathan Ellis and Matt Pfeil, both employees of Rackspace. Jonathan, being contributor to Apache Cassandra and Project Share as well, was considering leaving Rackspace, and Matt Pfeil went to talk to him and say, “Hey, there’s some really cool stuff going on here, you should really consider staying.” And by the end of the conversation, they were founding a company together.

And so DataStax was founded to support Apache Cassandra. Over time, we began adding Enterprise features and selling an Enterprise distribution of the database with these features added, and then, of course, more recently, the cloud platform as a service offering as well.

Evolution Of Support Offering

Mike Schwartz: Actually, I didn’t realize that you started out providing support. Because when I first ran into DataStax, I guess I had just known it as a distribution of Cassandra. And now, I see that you’re also providing support for the open-source distribution. Can you talk a little bit about how that’s evolved over time? Has it always been there or has there been a focus on for or against doing that?

Kathryn Erickson: It hasn’t always been there. When DataStax was founded 10 years ago, there wasn’t really a playbook for how to build and run a successful open-source company.
We were founded around the premise of providing support and consulting for Apache Cassandra. Over time, we did, all for the Enterprise Edition, but what you see with most Enterprises is that they have a mix of the Enterprise version and open source. For some customers, that’s dependent on the criticality of the data, and for other customers, it’s dependent on the features or the distribution, being the as-a-service offering or self-installed on-prem.

And so, what we saw in the last year was that there were some obvious things that we weren’t doing, and our customers needed support and consulting around open-source Cassandra. We are beginning to open-source a lot more of the features that would build Cassandra abundance, and so, it made sense to bring those offerings back.

Astra – DataStax Cloud Offering

Mike Schwartz: Okay, and you mentioned that DataStax launched a new hosted service called Astra. Do you see that product as a driver for revenue, or is it just an easier path for customers to test drive the product?

Kathryn Erickson: I think that will evolve over time. I think at launch, it is the easiest way to learn Apache Cassandra. And I think as we launched the hybrid option, I believe that’s later this year, that would become a more significant line of revenue.

Pricing

Mike Schwartz: Most of the revenue today I guess is from the license Enterprise product, so focusing on that, a lot of open-source businesses are moving towards consumption-based pricing. And I’m wondering, what kind of metrics do you use to determine what is consumption?

Kathryn Erickson: You know, a cloud-based offering consumption is based on capacity. And with our licensed product and with Luna, the open-source support offering, our focus this year has been around simplification of the pricing model. And we revisit that each year.

With the Enterprise product, we previously charged for the Enterprise license, and then, an optional additional fee for advanced workloads, like Spark analytics and graph. That’s confusing for the customer, they just want a simple pricing mechanism. So, we collapse that pricing. And then, of course, for larger deals ,we would have ELAs, or special terms to accommodate those customers.


Mike Schwartz: That consumption is based on, like, per CPU, per server, or how do you actually figure out what is the size?

Kathryn Erickson: It’s true capacity-based, the size of the data set being stored. And as we move to Astra hybrid, which will be that offering on-prem, I think we’ll consider that pricing option there as well.

Market Segmentation

Mike Schwartz: Data persistence is like the most horizontal market on the planet. Every company basically needs to store data. When you can sell to everyone, it’s sort of a blessing and a curse. Do you segment the market at all vertically or by use case, or do you just not segment the market?


Kathryn Erickson: It’s hard to segment when you’re serving a pretty broad market. What we try to do is have as easy of an on-ramp for the different verticals as possible. We see data models look similar between IoT use cases, inventory and messaging data models would be similar.
So, we don’t segment the market for go-to-market strategies, but we try to find places of repeatable consulting efforts to speed up the successes for those customers.

Partnerships

Mike Schwartz: When you took on the role of director of strategic Pprtnerships, you probably did a survey of the range of partnerships that exist. Can you talk about like what is the partner landscape look like at DataStax?

Kathryn Erickson: I ran our technology partner program, and there’s two other sides of that, SI partners and the cloud partners. On the technology side, you want to make it easy as possible for customers to consume your product.

So, in a technology partner program, you want to understand the user journey to get to your product, and make sure that those adjacent technologies have the simplest most repeatable easy to build, easy to test integrations as possible over time. If you want to think about specific companies and integrations, every database needs an ODBC and JDBC connector. And customers want those for BI, for reporting, for simple ways to move data in and out of the system, but in the last few years, most customers also want to see Kafka connectors and more high-speed ingest Pub/Sub integrations.  So, we want to accommodate those as well.

Mike Schwartz: Coming on the System Integrator side, you know, at Gluu, we found that those have been essential for us, to be able to focus on innovating the product versus getting involved in specific projects. But there’s such a broad range when you’re serving a global market of the System Integrators. Do you consider them channel partners or integration partners?


Kathryn Erickson: We usually consider them strategic partners when we take those types of partnerships on. And the goal is usually to help us penetrate markets that we don’t currently have field team in, or packaged, or cookie-cutter solutions. If you look at some of the stuff that we’ve done with VMware and with partnerships at Dell, we want to assert that the product stack works as recommended for customers that are used to seeing these reference architectures from these larger integrators and technology companies.

Most Important Partnerships For Driving Revenues

Mike Schwartz:  Which partnerships, do you think are the most important for actually driving growth?

Kathryn Erickson:  Deloitte’s been in a role to our federal business, they know that space better than any startup could hope. VMware for helping to modernize Enterprise platforms. Enterprises that are looking at Cassandra and looking at DataStax are usually going through some type of digital transformation. And the product that they already have in place is VMware. So, everything that we could do to make that migration to know SQL smooth was helpful to those customers. VMware has been a pretty big partner in my journey.

Open Source Strategy

Mike Schwartz: Some of the companies we’ve interviewed are moving to a 100% open-source strategy, specifically Chef and Cloudera. In the past, the value property DataStax, it had improved distribution of Cassandra.But do you see DataStax maybe moving more in the direction of open-sourcing its platforms and some of that technology it’s developed?

Kathryn Erickson: We are open-sourcing a lot more. We try to stick to simple rules for open sourcing, simple rule is, it’s a Harvard Business review article, simple rules for a complex world.
And so, simple rules for open source, if it increases adoption Cassandra, it should be open-sourced. And if it’s Enterprise feature that’s more specific to Enterprise customers, like security features or advanced replication options, then that would be kept proprietary.

And then, where should something be open-sourced? Well, if it makes a change to the core of Cassandra, of course it should go to the Apache project. And if it increases abundance, but it’s not impactful to the core of the project, then it still should be open-sourced, but maybe able to exist in a DataStax repo or different foundation.

Does Open Source Help?

Mike Schwartz: Do you think the wider open-source community A Cassandra helps DataStax too?

Kathryn Erickson: Of course, open source is all about positive sum games. I think it was Thomas Jefferson that said, “If use my light to light your torch, then we both have light.” And that’s how open-source works. The more communities and more companies that you can move from being other to being self, the larger the positive sum game that you’re playing. So, it’s open source, and open-source abundance is absolutely essential to the success of any open-source company.

Thoughts About Open Source Foundations?



Mike Schwartz: Any thoughts about Cassandra being hosted at the Apache Foundation versus perhaps Linux Foundation or the CMSF?

Kathryn Erickson:  I don’t have any opinions on the other foundations, but I think that Apache Cassandra will always be at home with the ASF. They have their simple rules for what it means to protect the open-source nature of a project, and they don’t waiver. And for a vendor backing an open-source project, that can be like a Northern Light, you can lose your way, and you can always look back up and reorient towards the community.

But you know, there’s nice things when you see CNCF, you know, the marketing wing, and the power of the CloudNative messaging that’s there. But there’s no reason that projects can’t have pieces that exist in different foundations either.

We see ourselves and others that build communities operators or management APIs or drivers is an example, they should live in a project, but management tooling that exists that the maintainers of the project wouldn’t want entry. So, something like that maybe should live in a CNCF type of foundation that’s focused on CloudNative. But no Apache Cassandra will remain Apache, and that’s a tome.

Industry Changes In The Last 10 Years

Mike Schwartz: So, DataStax is one of more mature, well-established companies in the open-source ecosystem today. What are some of the challenges you think that you are looking at now that were different than when you got started?

Kathryn Erickson: When I started a DataStax, it didn’t always feel like we had a lot of competition. And I think as other good distributed databases emerged, we adjusted to having competition. I think the obvious answer that most people would expect is pressure from the public Cloud vendors. But if you stay oriented on the positive sum nature of open source, then that becomes easy to embrace as well.

So, there’s changes in understanding the virtuous cycles of open-source, understanding how to build software as-a-service more quickly as Kubernetes has matured that’s become a lot easier. So, I think the ecosystem around us has matured a lot, the playbooks around how to build a company around open source have matured. And there are more senior projects that kind of exist in our ecosystem that we can work with and learn from as well.

Is Open Source Table Stakes For Databases?

Mike Schwartz: You know, most of the databases that have been released in the last, let’s say five to eight years or so, have been open source. Is being open source basically like table stakes now? So, is it a non-differentiator in the database market?

Kathryn Erickson: I think that if you’re moving from a proprietary relational system, and moving towards NoSQL, then you’re obviously moving into an open-source world. And if you can choose something that has a security life, security blanket that you know will outlive any vendor behind it, then you should consider those options first.

I think that it would be hard to start proprietary databases without the support of the community and of these foundations. I think Snowflake has done an exceptional job and is kind of the exception to the open-source game. But, you know, they were disruptive in a much different way. NoSQL in general is an open-source family.

Data Platform Trends

Mike Schwartz: Just a general database question about the database market. So, we’ve interviewed a probably more database companies on this podcast than any other type of company, but have you ever seen a real shift in the way that customers think about databases.

In the old days, I think you just used to get one database and hope it did everything, but have you seen a sort of on the technology side a shift in the way that companies are thinking about data and databases now, with more SaaS hosted offerings and more database offerings, like in general.

Kathryn Erickson: Yes. I think I think this is definitely the age of data platforms. With Cassandra, we see customers considering NoSQL when they’re using the relational system. And it can’t support the throughput that they need anymore, or they need to replicate more geographies, or exist in a multi-cloud or hybrid environment.

And so, that’s when you consider Cassandra. If you look at when you might consider Mongo, you want to get quick start with a developer friendly environment that’s great for mobile. What you start to see is that there’s a certain fit for purpose that the different NoSQL databases have. We’ve started to see an emergence of multi-model systems that move forward. And consolidating those capabilities, we have that with our Enterprise products and their integrations for graph analytics and search, we want to help customers build high-growth applications, high-speed transactional applications are the sweet spot of any Cassandra deployment.

Advice For Startup

Mike Schwartz: This is a question, a sort of a generic question for entrepreneurs who want to launch a business around an open-source product. I’m wondering if you have any advice, for let’s say, startups? And it could be general and it could be about partnerships.

Kathryn Erickson: You don’t have to invent a path to success, you can listen to the A16 podcast, you can look at other companies that are out there. You can go through so many success stories on podcasts like this, you can listen to Cockroach, and there are Open Source Underdogs podcast talk about how they’re thinking about licensing other companies. You know, having similar conversations, really understand what has made other companies successful, and don’t try to invent that yourself.

How To Improve Tech Diversity?


Mike Schwartz: Last question. As you’ve might noticed, there aren’t enough women in the tech business, including there haven’t been enough women on my podcast, so thank you for joining. What can we do to reverse that trend?

Kathryn Erickson: I think there’s a lot that we can do. as You are on the side of making mistakes, just try things, and if it’s not the right thing or if it doesn’t work, try something else. We’re going to do a program at DataStax, you know, Jumpstart, if you’re a woman or a person of color, and you want to learn Cassandra, and you don’t know where to start, just hit the button, sign up. Somebody from the team will meet with you for 30 minutes and help you get started. That might work, that might fall flat, but we’re going to just start trying stuff. And I think everyone should just start trying the ideas that they have, and we should all tell each other what’s working.

How’D You Get Started?

Mike Schwartz: How did you get started in the tech industry?

Kathryn Erickson: Well, my dad taught Computer Science, Community College, and I was going to be a DNA researcher. And I just wasn’t very good at it, and I thought, “You know what dad’s over Computer Science, we’ve been playing with computers all of our lives.” That sounds more like playing then working, it’s been that way ever since. It feels more like playing than working every day,

Mike Schwartz: That’s great. Thank you so much for joining us today, Kathryn, and sharing your insights. And best of luck at DataStax.

Kathryn Erickson: Sure. Thank you.

Closing

Mike Schwartz: Thanks to the DataStax PR team for helping us to schedule some time with Kathryn.

Editing by Ines Cetenji. Transcription by Marina Andjelkovic. Cool graphics by Kamal Bhattacharjee. Music from Broke For Free, Chris Zabriskie and Lee Rosevere.

Next episode we’re excited to have Cornelia Davis, author of Cloud Native Patterns, a Manning book that needs to be on every software architect’s bookshelf. She’s also the CTO of Weaveworks. She was fantastic, so don’t miss it. Until next time, thanks for listening, and stay safe.