Lex Fridman Podcast - #224 – Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming
Episode Date: September 23, 2021
Travis Oliphant is a data scientist, entrepreneur, and creator of NumPy, SciPy, and Anaconda. Please support this podcast by checking out our sponsors:
- Novo: https://banknovo.com/lex
- Allform: http...s://allform.com/lex to get 20% off
- Onnit: https://lexfridman.com/onnit to get up to 10% off
- Athletic Greens: https://athleticgreens.com/lex and use code LEX to get 1 month of fish oil
- Blinkist: https://blinkist.com/lex and use code LEX to get 25% off premium
EPISODE LINKS:
Travis's Twitter: https://twitter.com/teoliphant
Travis's Wiki Page: https://en.wikipedia.org/wiki/Travis_Oliphant
NumPy: https://numpy.org/
SciPy: https://scipy.org/about.html
Anaconda: https://www.anaconda.com/products/individual
Quansight: https://www.quansight.com
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips
SUPPORT & CONNECT:
- Check out the sponsors above, it's the best way to support this podcast
- Support on Patreon: https://www.patreon.com/lexfridman
- Twitter: https://twitter.com/lexfridman
- Instagram: https://www.instagram.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/lexfridman
- Medium: https://medium.com/@lexfridman
OUTLINE: Here's the timestamps for the episode. On some podcast players you should be able to click the timestamp to jump to that time.
(00:00) - Introduction
(07:06) - Early programming
(28:47) - SciPy
(45:41) - Open source
(57:23) - NumPy
(1:34:39) - Guido van Rossum
(1:46:57) - Efficiency
(1:55:49) - Objects
(2:02:47) - Numba
(2:11:53) - Anaconda
(2:16:20) - Conda
(2:31:56) - Quansight Labs
(2:35:32) - OpenTeams
(2:43:05) - GitHub
(2:48:35) - Marketing
(2:53:13) - Great programming
(3:04:03) - Hiring
(3:08:01) - Advice for young people
Transcript
The following is a conversation with Travis Oliphant, one of the most impactful programmers
and data scientists ever.
He created NumPy, SciPy, and Anaconda.
NumPy formed the foundation of tensor-based machine learning in Python, SciPy formed the
foundation of scientific programming in Python, and Anaconda, specifically with Conda, made
Python more accessible to a much larger audience.
Travis's life work across a large number of programming and entrepreneurial efforts
has and will continue to have a measurable impact on millions of lives by empowering scientists
and engineers in big companies, small companies, and open source communities to take on difficult
problems and solve them with the power of programming. Plus, he's a truly kind human being,
which is something that when combined with vision and ambition makes for a great leader and a great
person to chat with. To support this podcast, please check out our sponsors in the description.
As usual, I'll do a few minutes of ads now, no ads in the middle. I try to make these
interesting so hopefully you don't skip, but if you do, please still check out the sponsor
links in the description. It does happen to be the best way to support this podcast.
I use this stuff, I enjoy it, maybe you will too. This show is brought to you by Novo, which is a business banking app.
The process is simple, you sign up, they'll mail you a Novo debit card, and you get free
ATM use.
Honestly, if there's any industry that needs disruption, it's the old school banking industry,
and that's exactly what Novo does.
It's backed by FDIC insurance
like the old school banks but there's no hidden fees, no monthly fees or
minimum balance requirements. It has an easy to use mobile app. You can apply in
under 10 minutes. There's always human-powered customer service. I like the way
that sounds, human-powered. Free transfers, mailed checks, and incoming wires.
Integrates with other small business tools like Stripe, Shopify, QuickBooks, and many
more.
Refunds all ATM fees, like I said, plus thousands of dollars in exclusive perks. Go to
banknovo.com slash lex to sign up for free.
That's banknovo.com slash lex. I highly encourage you go there
and support the disruption of the old school banking industry.
This show is also sponsored by Allform, a furniture company. They ship to your home quickly,
take it back for free if you don't like it in the first 100 days. It's easy to assemble,
looks beautiful and classy and it feels amazing.
I love it.
I have one of their love seats. It's leather, it's black, it looks gorgeous, it feels great.
I hang out with a lot of interesting people on that love seat.
There's something about the dynamics of a love seat.
You do have a little bit of space between you, but you're just close enough to where you
have to contend
with the challenges of human intimacy. For example, I hang out with Michael Malice on that love seat,
although he specifically refuses to acknowledge the emergent humanity in this particular
computer's soul. Anyway, go to allform.com slash Lex.
They're offering 20% off all orders.
If you go to allform.com slash Lex, there's small couches there, big ones, and it's beautiful
and it feels great.
This episode is also brought to you by Onnit, a nutrition, supplement, and fitness company. They make Alpha Brain, which
is a nootropic that helps support memory, mental speed, and focus. I take it when I have a
difficult deep work session coming up and I really need help to get my mind to that
place of clarity of focus. I don't like to overcaffeinate myself and get jittery. I like
what Alpha Brain does. It's like a steady focus that I get from it. I don't use it every day, so it's not part of
my ritual, it's more like a jet pack, a super boost that I use when I know the session
is going to be difficult, mentally challenging. Obviously I first heard about Onnit on the
Joe Rogan Experience, so it feels surreal to be doing an ad read for them now.
But I've really enjoyed this stuff for a long time, so I'm proud and happy.
Anyway, go to lexfridman.com slash onnit to get up to 10% off Alpha Brain.
That's lexfridman.com slash onnit.
This show is also brought to you by Athletic Greens and its newly renamed AG1 drink, which
is an all in one daily drink to support better health and peak performance.
It replaced the multivitamin for me and went far beyond that with 75 vitamins and minerals.
There's so much good stuff in there.
It tastes delicious.
It's now like an integral part of my day.
I drink it now at least twice a
day and make sure that I get the nutritional base for all the crazy things I do in terms
of athletic pursuits, in terms of long sessions of deep work and also having a keto or mostly
a carnivore diet. So get the nutrition that my mind, my body needs, given all the diet, given all the hard
work I have to undertake.
Anyway, they will give you one month supply of fish oil when you sign up at AthleticGreens.com
slash Lex, that's AthleticGreens.com slash Lex.
This episode is also brought to you by Blinkist, my favorite app for learning new things.
Blinkist takes the key ideas from thousands of non-fiction books and condenses them down
into just 15 minutes that you can read or listen to.
I love reading slowly, reading through the whole book and thinking deeply through every
page, through every paragraph. But coupled with that, I also love using Blinkist to pick which books I'm going to read next,
and using Blinkist to review the key insights from the books I've already read.
So because I read so slowly and so deeply, I can't possibly afford to read a large number of books. So I rely on Blinkist to help me remember the
key insights from books and to help pick future books. Go to Blinkist.com slash Lex to
start your free seven day trial and get 25% off of a Blinkist Premium membership. That's
Blinkist.com slash Lex. This is the Lex Fridman Podcast, and here is my conversation with Travis Oliphant.
What was the first computer program you've ever written? Do you remember?
Whoa, it's a good question.
I think it was in fourth grade, just a simple loop in BASIC.
BASIC.
BASIC, yeah, on an Atari 400, I think, or maybe an Atari 800.
It was part of a class, and we were just writing basic loops
to print things out. Did you use GOTO statements? Yes, yes, we used GOTO statements. I remember
in the early days, that's when I first realized there are, like, principles to programming, when
I was told, don't use GOTO statements. Those are bad software engineering principles,
like it goes against what great, beautiful
code is. I was like, oh, okay, there's rules to this game. I didn't see that until high school
when I took an AP computer science course. I did a lot of other kinds of just basic programming
on the TI. But finally, when I took an AP computer science course in Pascal. Wow. Yeah,
it was Pascal. That's when I thought, oh, there are these principles. Not C or C++?
No, I didn't take C until the next year in college,
I had a course in C, but I haven't done much in Pascal,
just that AP computer science course.
Now, sorry for the romanticized question,
but when did you first fall in love with programming?
Oh, man, good question.
I think actually when I was 10, you know, my dad got us a Timex
Sinclair, and he was excited about the spreadsheet capability, but I made him get the
BASIC add-on so we could actually program in BASIC, and just being able to write instructions
and have the computer do something.
And then we got a TI-99/4A when I was about 12.
And I would just, it had sprites
and graphics and music. You could actually program and do music. That's when I really
sort of fell in love with programming. So this is a full, like, a real computer with, like,
with memory and storage? Yeah, and a processor. The Timex
Sinclair was one of the very first. It was cheap, I think, well,
it was still expensive, but it had 2K of memory, and we got the 16K add-on pack.
But yeah, it had memory and you could program it.
In order to store your programs, you had to attach a tape drive.
Remember that old sound that would play when you connected a modem? It would convert
digital bits to audio, and that was saved on the tape drive.
Still remember that sound, but that was the storage.
And what was the programming language, do you remember?
It was BASIC.
It was BASIC.
And then they had VisiCalc.
And so a little bit of spreadsheet programming
in VisiCalc, but mostly just some BASIC.
Do you remember what kind of things drew you to programming?
Was it working with data?
Was it video games?
And video games?
Math.
Math?
Yeah, I've always loved math. And a lot of people think they don't like math because I think
when they're exposed to it early, it's about memory. When you're exposed to math early, if
you have a good short-term memory, you can remember your times tables. And I do have a reasonably,
I'm not perfect, but a reasonably long short-term memory
buffer.
And so I did great at times tables.
I said, oh, I'm good at math.
But I started to really like math, just the problem solving aspect.
And so computing was problem solving applied.
And so that's always kind of been the draw, kind of coupled with the mathematics.
Did you ever see the computer as like an extension of your mind, like something able to achieve? Not till later.
Okay.
Yeah, not then.
It's just like a little set of puzzles that you can play with and you can, you can play with math puzzles.
And yeah, it was, it was too rudimentary early on.
Like it was sort of, yeah, it was a lot of work to actually take a thought you'd have and actually get it implemented.
And that's still work, but it's getting easier. And so, yeah, I would say that's definitely what
attracted me to Python, is that that was more real, right? I could think in Python.
Speaking of foreign languages, I only speak one other language fluently besides English, which is
Spanish. And I remember the day when I would dream in Spanish. And you start to think in that language.
And then you actually, I do definitely believe that language limits or expands your thinking.
There are some languages that actually lead you to certain thought processes.
Yeah, like, so I speak Russian fluently. And that's certainly a language that leads you down certain
thoughts.
Well, yeah, I mean, there's a history of the two world wars, of the millions of people
starving to death or near to death throughout this history of suffering, of injustice,
like this promise sold to the people and then the carpet, or whatever,
is swept out from under them. It's like broken promises, and all of that pain and melancholy
is in the language, the sad songs, the sad hopeful songs, the over-romanticized, like,
I love you, I hate you, the sort of swings between all the various spectrums of emotion. So that's all within the language. It's twisted,
there's a strong culture of rhyming poetry. So like the bards, there's a
musicality to the language too.
Dostoevsky wrote in Russian?
Yeah, so like Dostoevsky, Tolstoy, all the...
All the ones that I know about, which are translated,
and I'm curious how the translations...
So, Dostoevsky did not use the musicality of the language
too much, so they actually translate pretty well
because it's so philosophically dense
that the story does a lot of the work,
but there's a bunch of things that are untranslatable. Certainly the poetry is not translatable. I actually have a few conversations
coming up, offline and also on this podcast, with people who've translated Dostoevsky.
And people who have worked in this field know how difficult that is.
Sometimes you can spend, you know, months
thinking about a single sentence, right? In context, like, because there's just a magic
captured by that sentence, and how do you translate it in just the right way? Because those words
can be really powerful. There's a famous line, "Beauty will save the world," from
Dostoevsky. You know, there's so many ways to translate that. And you're right.
The language gives you the tools with which to tell the story, but it also leads your mind
down certain trajectories and paths to where over time, as you think in that language, you
become a different human being.
Yes.
Yeah. That's a fascinating reality, I think. I know people have explored that, but it's
just rediscovered. Well, we don't, we live in our own like little pockets.
Like this is the sad thing.
Is I feel like unfortunately, given time
and given getting older, I'll never know China,
the Chinese world.
Because I don't truly know the language.
Same with Japanese, I don't really know Japanese,
and Portuguese and Brazil, that whole South American continent. Like, yeah, our technical world is in English,
and so much of it might be lost because we don't have the common language.
I completely agree. I'm very much in that vein of, there's a lot of genius out there that we miss,
and it's sort of fortunate when it bubbles up into something that we can understand or process.
There's a lot we miss.
So it's why I tend to lean towards really loving
democratization or things that empower people
or I'm very resistant to sort of authoritarian structures.
Fundamentally for that reason, it's, well,
several reasons, but it just hurts us.
We're worse off.
So speaking of languages that empower you.
So Python was the first language for me that
I really enjoyed thinking in. Yeah. As you said. So you share my experience too.
So when did you first do you remember when you first kind of connected with Python maybe you
even fell in love with Python? It's a good question. It was a process that took about a year. I first
encountered Python in 1997. I was a graduate student studying biomedical engineering at the Mayo Clinic, and previously I'd been involved in
taking information from satellites. So I was an electrical engineering student,
used to taking information and trying to get something out of it, doing some data processing to get
information out of it. I'd done that in MATLAB. I'd done that in Perl. I'd done that in
scripting on VMS. There was actually a VAX VMS system, and they had their own little scripting tools around Fortran.
I'd done a lot of that. And then as a graduate student, I was looking for something and encountered Python.
And Python had two things that made me not filter it away,
because I was filtering a bunch of stuff. There was Yorick, I looked at Yorick, I looked at a few
other languages that were out there at the time in 1997.
But it had arrays. There was a library called Numeric that had just been written in '95,
like not very much earlier,
by an MIT alum, Jim Hugunin. I went back and read the mailing list to see the history
of how it grew, and
it was very interesting, fascinating to do that, actually, to see how this emergent
cooperation, unstructured cooperation, happens in the open-source world that led to a lot of
this collective programming, which is something we might get into a little later, but what that
looks like. What gap did Numeric fill? Numeric filled the gap of having an array object.
So before that, there was no array object.
There was a one-dimensional byte concept, but there was no N-dimensional,
two-, three-, four-dimensional tensor, as they call it now.
I'm still in the camp that a tensor is another thing and this is just an N-D array,
which is what we should call it, but I've kind of lost that battle.
There are just many battles in this world,
some you win, some you lose.
That's exciting.
And, but it had no math to it.
So Numeric had math and a basic way to think in arrays.
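To make the array-object idea concrete, here is a minimal sketch. It uses modern NumPy, the successor to Numeric, rather than the 1995 Numeric API, and the values and variable names are illustrative assumptions, not anything from the conversation.

```python
import numpy as np

# A plain nested list has no notion of shape or whole-array math.
nested = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# An N-dimensional array carries its shape and supports math directly.
a = np.array(nested)          # 2-D array with shape (2, 3)
cube = np.zeros((2, 3, 4))    # 3-D array

doubled = a * 2               # element-wise multiply, no explicit loop
total = a.sum()               # sum over every element

print(a.shape, cube.ndim, total)
```

The point of the sketch is only that shape and arithmetic live on the array object itself, which is the gap described above.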
So I was looking for that, and it had complex numbers.
A lot of programming languages don't.
And you can see why, because, you know,
if you're just a computer scientist, you think complex numbers are just two floats.
So people can build that on.
But in practice, complex numbers are
one of the significant algebras that help connect
a lot of physical and mathematical ideas,
particularly the FFT for an electrical engineer.
And it's a really important concept.
And not having it means you have to develop it several times.
And those times may not share an approach.
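As an illustration of why a built-in complex type matters, here is a small sketch using Python's native complex numbers and NumPy's FFT. The 8-sample cosine signal is an assumed toy example chosen for this note.

```python
import cmath
import numpy as np

z = 3 + 4j                          # Python writes the imaginary unit as j
magnitude = abs(z)                  # 5.0
rotated = z * 1j                    # multiplying by j rotates by 90 degrees
euler = cmath.exp(1j * cmath.pi)    # approximately -1, per Euler's identity

# FFT of a pure tone: exactly one cycle across 8 samples.
n = 8
t = np.arange(n)
signal = np.cos(2 * np.pi * t / n)
spectrum = np.fft.fft(signal)       # complex-valued result

# The energy lands in frequency bin 1 (and its mirror image, bin 7).
peak_bin = int(np.argmax(np.abs(spectrum[: n // 2])))
print(magnitude, rotated, peak_bin)
```

Because the type is shared, the FFT output, the rotation, and the magnitude all use one common representation instead of several ad hoc pairs of floats.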
One of the common things in programming,
one of the things programming enables is abstractions.
But when you have shared abstractions,
it's even better.
It sort of gets to the level of language
of actually we all think of this the same way,
which is both powerful and dangerous, right?
Because powerful in that we now can quickly make bigger
and higher level things on top of those abstractions,
dangerous because it also limits us
as to the things we maybe left behind
in producing an abstraction, which is at the heart
of programming today and actually building
around the programming world.
So I think it's a fascinating philosophical topic.
Yeah, they will continue for many years.
I think it's different as it builds more and more and more abstractions. Yes, I often think about, you know, we have,
we have a world that's built on these abstractions. Were they the only ones possible?
Certainly not, but they led to, you know, it's very hard to do it differently. Yeah.
Like there's an inertia that's very hard to, you know, push away from. There are
implications for things like, you know, the Julia language, which you've heard of, I'm sure. And I've met the creators and I like
Julia. It's a really cool language, but they've struggled against just the
tide of this inertia of people using Python. And, you know, there's strategies to approach
that, but nonetheless, it's a phenomenon. So I love complex numbers and
I love arrays, so I looked at Python.
And then I had the experience,
I did some stuff in Python, and I was just doing my PhD,
so I was, my focus was on,
I was actually doing a combination of MRI and ultrasound
and looking at a phenomenon called elastography,
which is you push waves into the body
and observe those waves, like you can actually measure them.
And then you do mathematical inversion to see what the elasticity is.
And so that's the problem I was solving is how to do that with both ultrasound and MRI.
I needed some tool to do that with.
So I started using Python in 1997.
In 1998, I went back, looked at what I'd written and realized I could still understand it,
which is not the experience I'd had
when doing Perl in '95, right?
I'd done the same thing, and then I looked back,
and I'd forgotten what I was even saying.
Now, you know, I'm not saying it.
So that, I mean, I thought, hey, this may work.
I like this.
This is something I can retain
without becoming an expert per se.
And so that led me to go, I'm gonna push more to this.
And then '98 was kind of
when I started to fall in love with Python, I would say.
A few peculiar things about Python.
So maybe compare it to Perl,
compare it to some of the other languages.
So there's no braces.
Yeah.
So spaces, indentation I should say, are used as part of the language.
Right. So did you, I mean, that's quite a leap. Were you comfortable with that leap, or
were you just very open-minded? It's a good question. I was open-minded. So I was cognizant of
the concern. And it definitely has specific challenges,
cutting and pasting, for example, when you're cutting and pasting code.
And if your editors aren't supportive of that,
or you're pasting into a terminal,
and particularly in the past, when terminals
didn't necessarily have the intelligence to manage it.
Now, IPython and Jupyter notebooks handle that just fine,
there's really no problem, but in the past,
it created some challenges, formatting challenges.
Also mixed tabs and
spaces.
If your editors weren't, if you weren't clear what was happening, you would have these issues.
So there were really concrete reasons about it that I heard and understood.
I never really encountered a problem with it.
Personally, like there were occasional annoyances, but I really liked the fact that I didn't
have all these extra characters, right?
That these extra characters didn't show up in my visual field when I was just trying to process understanding a snippet of code.
Yeah, there's a cleanness to it.
But I mean, the idea is supposed to be that Perl also has a cleanness to it because of the minimalism of, like, how many characters it takes to express a certain thing.
Yeah.
So it's very compact.
Yeah. But what you realize is
that with that compactness comes a culture that prizes compactness. And so the code gets
more and more compact and less and less readable, to a point where, like, to be a good
programmer in Perl, you write code that's basically unreadable, culturally. Correct. And you're proud of it.
Yeah, you're proud of it.
Right, exactly.
And it, like, feels good.
And it's really selective.
It means you have to be an expert in Perl to understand it.
Whereas Python allowed you not to have to be an expert.
You don't have to take all this brain energy.
You could leverage, what I say is
you could leverage your English language center,
which you're using all the time.
I've wondered about other languages, particularly non-Latin-based languages.
With Latin-based languages, where the characters are at least similar,
I think people have an easier time, but I don't know what it's like to be a Japanese or
a Chinese person trying to learn a different syntax.
Like what would computer programming look like in that?
I haven't looked at that at all.
But it certainly doesn't, you know, leverage your Chinese language center.
I'm not sure Python or any programming language does that.
But that was a big deal, the fact that it was accessible. I could be a scientist. What I really liked is,
many programming languages really demand a lot of you, and you can get a lot,
you know, you can do a lot if you learn it. But Python enables you to do a lot without demanding a lot of you.
There's nuance to that statement, but it certainly was more accessible. So more people could actually, as a scientist, as someone who is trying to solve another problem besides
programming, I could still use this language and get things done and be happy about it.
Now, I was also comfortable in C at that time.
And MATLAB, you did a little bit of that?
And MATLAB I did a lot before that, exactly.
So I was comfortable in those three languages;
they were really the tools I used during my studies and schooling.
But to your point about language helping you think,
one of the big things about MATLAB,
and APL before it...
I don't know if you remember APL.
APL is actually the predecessor of array-based programming, which I think is really underappreciated.
If I talk to people who are just steeped in computer programming and computer science,
like most of the people that Microsoft has hired in the past, for example,
Microsoft as a company generally did not understand array-based programming,
like culturally they didn't understand it. So they kept missing the boat, kept missing the
understanding of what this was. They've gotten better, but there's still a whole culture of folks who
program, and what they know is systems programming or web programming or lists and maps, and
what about an N-dimensional array? Oh yeah, that's just an implementation detail.
Well, you can think that, but then actually if you have that as a construct, you actually think
differently.
APL was the first language to understand that, and it was in the '60s.
The challenge of APL is that APL had very dense, not only glyphs, like new characters, new
glyphs, they even had a new keyboard to produce those glyphs. This is back in the
early days of computing when the QWERTY keyboard maybe wasn't as established, so it's like, well,
we could have a new keyboard, no big deal.
But it was a big deal.
And it didn't catch on.
And with the language, APL, very much like Perl,
people would pride themselves on how much they could do:
could they write the Game of Life in 30 characters of APL?
APL has characters that mean summation.
And they have adverbs.
They have these things called adverbs, which are like methods; reduction,
for example, would be an adverb on an add operator.
Right, so using these tools,
you could construct, and then you start to think
at that level, you think in N dimensions,
it's something I like to say,
and you start to think differently
about data at that point.
You know, now it really helps.
Yeah, I mean, even outside of programming, if you really internalize linear algebra as a course,
I mean, it philosophically allows you to think of the world differently.
Yes.
It's almost like liberating.
You don't have to think about the individual numbers
in the N-dimensional array.
You could think of it as an object in itself,
and all of a sudden this world can open up.
You're saying MATLAB and APL were like the early seeds.
I don't know if many languages got that right ever.
No, no, no, they didn't.
Even still, even still, I would say. I mean, NumPy is
an inheritor of the traditions of, I would say, APL. J was
another version; what it did is not have the glyphs,
just short characters that a Latin keyboard could type. And then Numeric
inherited from that in terms of, let's add arrays plus broadcasting plus methods,
reduction, and even some of the language, like rank is a concept that
was in Python, is still in Python, for the number of dimensions, right? That's
different than, say, the rank of a matrix, which people think of as well. So it came from that tradition, but NumPy is a very pragmatic, practical tool.
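The three ideas mentioned here (broadcasting, reduction as a method on an operator, and "rank" as number of dimensions versus matrix rank) can be sketched with current NumPy. The arrays are made-up examples, and `ndim` is the modern NumPy spelling for the number of dimensions.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [2, 4, 6]])
row = np.array([10, 20, 30])

# Broadcasting: the 1-D row is stretched across both rows of a.
broadcast_sum = a + row

# Reduction: "reduce" applied to the add operator, here down the columns.
col_totals = np.add.reduce(a, axis=0)

# Number of dimensions (the array sense of "rank") ...
ndim = a.ndim
# ... versus linear-algebra rank; the second row is twice the first.
matrix_rank = np.linalg.matrix_rank(a)

print(broadcast_sum, col_totals, ndim, matrix_rank)
```

Here the array has two dimensions but matrix rank one, which is the distinction drawn above.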
NumPy inherited from Numeric, and we can get to where NumPy came from, which is the current
array, at least current as of 2016-2017. Now there's a ton of them over the past two or
three years. We can get into that too. So if we just
linger on the early days, what was your favorite feature of Python? Do you remember, like, what, yeah?
It's so interesting to linger on, like,
what really makes you connect with a language? I'm not sure it's obvious to introspect that. No, it isn't.
And I've thought about that
at some length.
I think definitely the fact
that I could read it later,
that I could use it productively
without becoming an expert.
And other languages,
I had to put more effort into.
Right.
That's like an empirical observation.
Like you're not analyzing any one aspect of the language.
It just seems time after time,
looking back, it's somehow readable.
It's somewhat readable, then it was sort of,
I could take executable English and translate
to Python more easily.
I didn't have to go, there was no translation layer.
As an engineer or as a scientist,
I could think about what I wanted to do,
and then the syntax wasn't that far behind it.
Yeah.
Right?
There are some warts there still.
It wasn't perfect. There's some areas
where I'm like, it'd be better if this were different or if that were different. Some of
those things got out of the language too. I was really grateful for some of the early
pioneers in the Python ecosystem back then, because Python got written in '91, that's when the first
version came out. But Guido was very open to users. And one of the sets of users were people
like Jim Hugunin and David Ascher and Paul Dubois and
Konrad Hinsen. These were people that were on the mailing list, and they were just asking for things like, hey,
we really should have complex numbers in this language. So, you know,
there's a j, there's a 1j, right? And in fact they went the engineering route of j, which is interesting.
I don't think that's entirely to favor engineers.
I think it's because i is so often used as the index of a for loop.
I think that's actually why. You're probably right.
I mean, there's a pragmatic aspect.
Well, the fact that complex numbers were there,
I loved that.
The fact that I could write N-D array constructs and that reduction was there,
very simple to write summations, and broadcasting was there,
I could do addition of whole arrays.
So that was cool. Those
were some of the things I liked about it. I don't know where to start talking to you because you've
created so many incredible projects that basically changed the whole landscape of programming.
But okay, let's go chronologically with SciPy. You created SciPy over two decades ago now.
Yes, right? Yes, I love to talk about SciPy.
SciPy was really my baby.
What was its goal?
What is its goal?
How does it work?
Yeah, fantastic.
So SciPy was effectively, here I'm using Python
to do stuff that I previously used MATLAB to do.
And I was using Numeric, which is an array library
that made a lot of it possible.
But there's things that were missing.
Like, I didn't have an ordinary differential equation
solver I could just call.
I didn't have integration.
Yeah, I wanted to integrate this function.
OK, well, I don't have just a function I can call to do that.
These are things I remember being critical things
that I was missing.
Optimization, I just want to pass a function to an optimizer
and tell me what the optimal value is. Those are things, like, well, why don't we just write a library that has these tools?
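The three missing pieces named here (integrating a function, solving an ordinary differential equation, and passing a function to an optimizer) look roughly like this in present-day SciPy. This is a sketch with the current API, not the original 1999 interface, and the test functions are assumptions picked because their answers are known in closed form.

```python
import numpy as np
from scipy import integrate, optimize

# Integration: the integral of sin(x) from 0 to pi is exactly 2.
area, _abs_err = integrate.quad(np.sin, 0.0, np.pi)

# ODE solver: dy/dt = -y with y(0) = 1 has the solution y(t) = exp(-t).
sol = integrate.solve_ivp(lambda t, y: -y, (0.0, 1.0), [1.0],
                          rtol=1e-8, atol=1e-10)
y_end = sol.y[0, -1]                 # value at t = 1

# Optimization: pass a function in, get the minimizing argument back.
res = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2 + 1.0)

print(area, y_end, res.x)
```

In each case the caller hands over an ordinary Python function and gets a number back, which is the "just a function I can call" convenience described above.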
And I started to post on the mailing list, and previously people had discussed it.
I remember Konrad Hinsen saying, wouldn't it be great if we had this optimizer library
here?
David Ascher would say this stuff.
And I'm, ambitious is the wrong word, eager, and probably had more time than sense. I was
a poor graduate student. My wife thinks I'm working on my PhD, and I am, but part of a
PhD that I loved was the fact that it's exploratory. You're not just taking orders, fulfilling
a list of things to do. You're trying to figure out what to do. I thought, well, I'm writing
tools for my own use in a PhD. So I'll just start
this project. And so in 1999, 1998 was when I first started to write libraries for Python.
Luckily, when I fell in love with Python in '98, I thought, well, there's a few things missing.
Like, oh, I need a reader to read DICOM files. That was in medical imaging, and DICOM was a format
that I wanted to be able to load into Python. Okay. How do I write a reader for that? So I wrote
something called, it was an IO package, right?
And that was my very first extension module, which is in C.
So I wrote C code to extend Python
so that in Python I could write things more easily.
That combination kind of hooked me.
It was the idea that here's this powerful tool
I can use as a scripting language
and a high-level language to think in,
but that I can extend easily in
C, easily for me because I knew enough C. And then Guido had written a link, I mean,
the only hard part of extending Python was the way memory management
works: you have to do reference counting. And so there's a tracking of reference counts
you have to do manually. And if you don't, you have memory leaks. And so that's hard, plus
then C, you know, it's much more,
you have to put more effort into it.
It's not just that. I now have to think about pointers
and have to think about stuff that is different.
It's kind of like you're putting a new cartridge
in your brain.
Like, okay, I'm thinking about MRI.
Now I'm thinking about programming.
And they're distinct modules you end up having to think about.
So it's harder.
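The reference counting mentioned above can be observed from Python itself. The sketch below is only an illustration: in a C extension the same bookkeeping is done by hand with `Py_INCREF` and `Py_DECREF`, and forgetting a decrement is what produces the leaks described here. The helper name `observe_reference_counts` is made up for this example, and the code compares counts rather than relying on absolute values, which vary between interpreter versions.

```python
import sys

def observe_reference_counts():
    """Show that storing an object in a container raises its reference count."""
    data = [1, 2, 3]
    before = sys.getrefcount(data)

    # Storing the object twice in a list adds two references to it.
    holders = [data, data]
    after = sys.getrefcount(data)

    # Dropping the list releases those two references again.
    del holders
    released = sys.getrefcount(data)
    return before, after, released

before, after, released = observe_reference_counts()
print("references added:", after - before)
print("back to baseline:", released == before)
```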
When I was just in Python, I could just think about MRI and high-level writing.
But I could do that.
And that kind of, I liked it.
I found that to be enjoyable and fun.
And so I ended up, oh, let me just add a bunch of stuff
to Python to do integration.
Well, and the cool thing is the power of the internet.
I was just looking around, and I found, oh, there's
this Netlib, which has hundreds of Fortran routines
that people had written in the 60s and the 70s and the 80s,
in Fortran 77. Fortunately, it wasn't Fortran 66.
It had been ported to Fortran 77.
And Fortran 77 is actually a really great language.
Fortran 90 probably is my favorite Fortran.
Because it's got complex numbers, got arrays,
and it's pretty high level.
Now, the problem with it is you'd never want to write a program
in Fortran 90 or Fortran 77, but it's totally fine to write a subroutine in.
And then Fortran kind of got a little off course
when they tried to compete with C++.
But at the time, I just wanted libraries to do something.
Like, oh, here's an ordinary differential equation.
Here's integration.
Here's Runge-Kutta integration.
Already done.
I don't have to think about that algorithm.
I mean, you could, but it's nice to have somebody
who's already done one and tested it.
And so I sort of started this journey in 98, really.
I've looked back at the mailing list.
There's sort of this productive era of me
writing an extension module to connect Runge-Kutta integration
to Python and making an ordinary differential equation solver.
And then releasing that as a package,
ODEPACK, I think I called it,
then QUADPACK.
And then I just made these packages.
Eventually that became Multipack,
because they were originally modular.
You can install them separately,
but a massive problem in Python
was actually just getting your stuff installed.
At the time, releasing software, for me,
today people think, what does that mean?
Well, then it meant I had some poorly written web page up and I put a tarball, just a gzipped tarball
of source code. That was the release. But okay, can we just stay on that, the
community aspect of creating the package and sharing that? Yes. That's rare, to
have both at that time.
So that was pretty early.
Yeah.
So, oh, well, not rare.
Maybe, maybe you can correct me on this, but it seems like in the scientific
communities, so many people, you were basically solving the problems you needed to
solve to process the particular application, the data that you need.
And to also have the mind that I'm going to make
this usable for others, that's.
I would say I was inspired. I'd been inspired by Linux, I'd been inspired by, you know,
Linus and him making his code available. And I was starting to use Linux at the time. I
went, this is cool. So I'd kind of been previously primed that way. And generally I was
into science because I like the sharing
notion. I like the idea of, hey, let's, if collectively we build knowledge and share it,
we can all be better off.
Okay. So you were energized about that?
So I was energized about it already. Yeah. Right. And I can't deny that. I was. I sort
of had this very, I liked that part of science, that part of sharing. And then I always
said, oh wait, here's something I could do. And then I slowly, over years, learned how to share better so that you could
actually engage more people faster. One of the key things was actually giving people
a binary they could install, right? So that it wasn't just, here's your source code, good luck.
Compile this. Instead it's compiled and ready to install. You know, so a lot of the journey
from '98, even through 2012, when I started Anaconda, was about that.
Like it's why, you know, it's really the key is to why
a scientist with dreams of doing MRI research
ended up starting a software company that
installs software.
I work with a few folks now that don't program
like on the creative side and the video side,
the audio side, and because my whole life is running on scripts, I have to try to get them,
I have now the task of teaching them how to do Python enough to run the scripts.
And so I've been actually facing this, whether it's with Anaconda or something else,
with the task of, how do I minimally explain, basically to my mom,
how to write a Python script?
And it's an interesting challenge.
After that, it's a to-do item for me to figure out like, what is the minimal
modern information I have to teach? What are the tools you use? Like, one, you enjoy it,
two, you're effective at, and then the debugging, like the iterative process of running the script
to figure out what the error is, maybe even for some people to do the fix yourself.
So do you compile it, do you distribute that code to them?
It's interesting because I think it's exactly what you're talking about.
If you increase the circle of empathy, the circle of people that are able to use your programs,
you increase
its effectiveness and its power.
So you have to think, can I write scripts, can I write programs that can be used by medical
engineers, by all kinds of people that don't know programming?
And actually maybe plant a seed, have them catch the bug of programming so that they start
on their journey.
That's a huge responsibility.
And ultimately it has to do with the Amazon one-click buy, like how frictionless can you make
the early steps?
Frictionless is actually really key.
To grow any community, at any friction point you're just going to lose some people.
Yeah.
Right.
Now, sometimes you may want to intentionally do that.
If you're early enough on, you need a lot of help.
You need people who have the skills.
It's actually helpful.
You don't necessarily want too many users,
as opposed to contributors, if you're early on.
Anyway, so SciPy started in '98,
but it really emerged as this collection of modules
that I was just putting on the net.
People were downloading.
And I think I got 100 users by the end of that year.
But the fact I got 100 users and more than that,
people started to email me with fixes.
That was actually intoxicating.
Here I'm writing papers,
I'm giving talks at conferences, and I get people to say hello,
yeah, good job.
But mostly when you're reviewed, it's competitive.
You publish a paper and people are like, oh, it wasn't my paper.
I was starting to see that side of academic life, where I thought there
was a cooperative effort, but it seemed like we're here just to one-up each other.
And it's not true across the board, but a lot of that's there.
But here in this world, I was getting responses from people all over the world.
I remember Pearu Peterson in Estonia, right?
He was one of the first people.
And he sent me back this makefile.
Because the first thing he said was, your build stinks.
Here's a better makefile.
Now, it was a complex makefile.
I never understood that makefile, actually.
But it worked, and it did a lot more.
And so I said, thanks, this is cool.
And that was my first kind of engagement
with community development.
But the process was, he sent me a patch file,
I had to upload a new tarball.
And I just found I really loved that.
And the style back then was, here's a mailing list.
It was very, it wasn't as,
there's certainly weren't the tools
that are available today.
It was very early on.
But I really started that the whole year,
I think I did about seven packages that year.
And then by the end of the year
I collected them into a thing called Multipack.
So in '99 there was this thing called Multipack,
and that's when a high school student,
and he was a high school student at the time,
a guy named Robert Kern, took that package
and made a Windows installer, right?
And then of course a massive increase of usage.
So by the way, most of this development was under Linux.
Yes, yes, it was on Linux.
I was a Linux developer doing it on a Linux box.
I mean, at the time I was actually getting into
a new hard drive, just some kernel programming
to make the hard drive work.
I mean, not programming, but modification to the kernel
so I can actually get a hard drive working.
I love that aspect of it.
I was also in school, I was building a cluster.
I took Mac computers and put Yellow Dog Linux on them.
At the Mayo Clinic, there were all these Macs,
they were older, they were just getting rid of them,
and so I kind of got permission to go grab them.
I put about 24 of them together
in a cluster in a cabinet,
and put Yellow Dog Linux on them all,
and I wrote a C++ program to do MRI simulation.
That was what I was doing at the same time
for my day job, so to speak.
So I was loving the whole process.
At the same time, I was, oh, I need an ordinary
differential equation solver.
That's why ordinary differential equations were key,
because the heart of a Bloch equation
simulator for MRI is an ODE solver, and so that's
why I actually did that. Those happened at the same time. So what you're working on and what you're interested in,
they're coinciding. I was definitely scratching my own itch in terms of building stuff,
which helped in the sense that I was using it for me. So at least I had one user. Yeah, I had one person
I was like, well, no, this is better.
I like this interface better.
And I had the experience of MATLAB to guide some of what
those APIs might look like.
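To make the link between MRI simulation and an ODE solver concrete, here is a minimal sketch that integrates the Bloch equations for free precession in the rotating frame using SciPy's `solve_ivp`. The relaxation times, off-resonance frequency, and starting magnetization are illustrative assumptions, not values from the episode, and this is not a reconstruction of the original C++ simulator.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative (assumed) tissue and field parameters.
M0 = 1.0                          # equilibrium magnetization
T1 = 1.0                          # longitudinal relaxation time, seconds
T2 = 0.1                          # transverse relaxation time, seconds
DELTA_OMEGA = 2 * np.pi * 10.0    # off-resonance precession, rad/s

def bloch_rhs(t, m):
    """Bloch equations in the rotating frame with no RF pulse applied."""
    mx, my, mz = m
    return [
        DELTA_OMEGA * my - mx / T2,   # precession plus T2 decay
        -DELTA_OMEGA * mx - my / T2,  # precession plus T2 decay
        (M0 - mz) / T1,               # T1 recovery toward equilibrium
    ]

# Start just after an ideal 90-degree pulse: all magnetization is transverse.
t_end = 0.5
result = solve_ivp(bloch_rhs, (0.0, t_end), [M0, 0.0, 0.0],
                   rtol=1e-8, atol=1e-10)

mx_end, my_end, mz_end = result.y[:, -1]
transverse_end = np.hypot(mx_end, my_end)

# Closed-form answers for this simple case, used as a sanity check.
expected_transverse = M0 * np.exp(-t_end / T2)
expected_mz = M0 * (1.0 - np.exp(-t_end / T1))
print(transverse_end, expected_transverse)
print(mz_end, expected_mz)
```

This free-precession case has a closed-form solution, which is what makes it a convenient check; a solver earns its keep once time-varying RF pulses and gradients are added.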
But you're just doing it yourself.
You're building all this stuff.
But the Windows install, it was the first time I realized,
oh, yeah, the binary install really helps people.
And so that led to spending more time on that side of things.
So around 2000. So I finished my PhD in 2000,
end of 2000.
So '99 doing a lot of work there, '98 doing a lot of work there,
'99 kind of spending more time on my PhD,
helping people use the tools, thinking about
where do I want to go from here.
There was a company, there were two guys actually,
Eric Jones and Travis Vaught. They were two friends
who founded a company called Enthought.
It's here in Austin, still here.
And Eric contacted me at the time, when I was still a graduate student, and he said,
hey, why don't you come down? We want to build a company, we're thinking of a scientific company,
and we want to take what you're doing and kind of add it to some stuff that he'd done.
He'd written some tools. And then Pearu Peterson had done f2py. Let's come together and
pull this all together and call it SciPy. So that's the origin of the SciPy brand. It came from,
you know, Multipack and a whole bunch of modules I'd written, plus a few things from some other
folks, pulled together in a single installer. SciPy was really a distribution of Python masquerading as a library.
How did you think about SciPy in context of Python,
in context of numeric?
What?
So we saw SciPy as a way to make an R&D environment
for Python.
Like, use Python, depend on Numeric.
So Numeric was the array library we depended on.
And then from there, extend it with a bunch of modules.
And at the time, the original vision of SciPy was to have plotting,
was to have, you know, the REPL environment and kind of a whole,
really a whole data environment that you could then install and get going with.
And that was kind of the thinking.
It didn't really evolve that way, right?
It sort of had a, but, one, it's really hard to do massive-scale projects with open source
collectives.
There's a sort of intrinsic cooperation limit, too many cooks in the
kitchen. You can do amazing infrastructure work.
When it comes to bringing it all together into a single deliverable, that actually requires
a little more product management that doesn't really emerge from the same dynamic.
So it struggled, it struggled to get,
there are almost too many voices, it's hard to have everybody agree,
you know, consensus doesn't really work at that scale,
you end up with politics, you know, with the same kind of things
that's happened in large organizations trying to decide
on what to do together.
So consensus building was challenging as more people came in. Early on,
it's fine because there's nobody there.
And so it works.
But then you get more successful and more people use it.
All of a sudden, oh, there's this scale at which this doesn't work anymore.
And we have to come up with different approaches.
So SciPy came out officially in 2001; that was the first release.
I remember the days of getting that release ready.
It was a Windows installer and there were bugs on how the Windows Compiler handled complex
numbers and you were chasing segmentation faults.
It was a lot of work.
There was a lot of effort.
It had nothing to do with my area of study.
At the same time, I had just gotten an offer.
He wondered if I wanted to come down and help him start that company with his friend.
And at the time, I was intrigued, but I was pursuing an academic path.
And I had just gotten an offer to go and teach at my alma mater.
So I took that tenure-track position, and then I started working on SciPy as
a professor too.
Okay.
So I left the Mayo Clinic, graduated, wrote my thesis using SciPy. You
know, there's images that were created.
Now, the plotting tool I used was something from Yorick, actually.
It was a plotting, a PLT, a plotting language that I used.
Yorick is a programming language?
It was a programming language.
It had a plotting tool. And DISLIN, there was an integration to DISLIN.
I ended up using DISLIN plus some of the plotting from Yorick,
linked to from Python.
Anyway, it was, people don't plot that way now.
But this is before, and SciPy was trying to add plotting,
right?
It didn't have much success.
Really the success of plotting came from John Hunter,
who had a similar experience to my experience,
my kind of maverick experience as a person
just trying to get stuff done and kind of having more time than
money maybe, right? And John Hunter created Matplotlib.
He's the creator of Matplotlib. Yeah, so John Hunter was, you know,
he was a student at the time.
He was, actually, he was working in the quant field, and he said, we need better plotting.
So he just went out and said, cool, I'll make a new project and we'll call it Matplotlib. And he released it in 2001,
about the same time that SciPy came out. And it was a separate library, separate install.
It used Numeric, SciPy used Numeric.
And so, you know, in 2001 we released SciPy, and then Enthought created a conference
called SciPy, which brought people together to talk about the space.
And that conference is still ongoing.
It's one of the favorite conferences of a lot of people. It's changed over the years, but early on it was, you know, a collection of 50 people,
scientists mostly, you know, practicing scientists who care about
coding and doing it well and not using MATLAB. I remember being driven by, I liked MATLAB.
But I didn't like the fact
that, so I'm not opposed to proprietary software. I'm actually not an open source zealot. I love open source for what it brings, but I also see the role for proprietary software. But what I didn't
like was the fact that I would develop code and publish it, and then effectively be telling somebody,
here, to run my code, you have to have this proprietary software. Right. And there's also the culture around
MATLAB, because I've talked to a few folks at MathWorks, great, smart, I mean, there's just a culture,
they try really hard, but there's this corporate IBM style
culture that's like, or whatever,
I don't want to say negative things about IBM,
or whatever, but there's a...
No, it's really that connection.
It's something I'm in the middle of right now
is the business of open source.
And how do you connect the ethos of cooperative development
with the necessity of creating profits, right?
And like right now, today, you know,
I'm still in the middle of that.
That's actually the early days of me exploring this question.
Because I was writing SciPy.
I mean, as an aside, I also had three kids at the time.
I have six kids now.
I got married early, wanted a family.
I had three kids and I remember reading, I read Richard Stallman's post and I was a fan of Stallman.
I would read his work.
I liked the collective ideas he would have.
Certainly the ideas on IP law,
I read a lot of stuff.
But then I said, okay,
how do I make money with this?
How do I make a living?
How do I pay for my kids?
All this stuff was in my mind.
Young graduate student making no money, thinking I gotta get a job. He said, well, I think
just be like me and don't have kids. That's just don't take on the set.
That was what he said in that moment. That's the thing I read and I went, okay, this is a train
I can't get on. There has to be a way to preserve the culture of open source and still be able to
make sufficient money to feed your kids. Yes, exactly.
There's got to be, well, so that actually led me to a study of economics.
Because at the time, I was ignorant, I really was.
I'm actually embarrassed for the educational system that they could let me, and I was
valedictorian in my high school class and I did super well in college.
And like academically, I did great, right?
But the fact that I could do that and then be clueless about this key part of life, it
led me to go, there's a problem.
Like, I should, I should learn this in fifth grade.
I should learn this in eighth grade.
Like, everybody should come out with a basic knowledge of economics.
You're an interesting example because you've created tools that change the lives of probably
millions of people.
And the fact that you didn't understand, at the time of the creation of those tools,
the basic economics of how to build up giant systems is a problem.
Yeah, it's a problem. And so during my PhD, this is back in '98, '99, at the same time
I was in the library. I was reading books on capitalism. I was reading books on Marxism. I was reading books on,
you know, what is this thing? What does it mean? And basically
what I encountered was a set of writings from people that said they're the inheritors of Adam
Smith, and I read Adam Smith for the first time, right? Which is The Wealth of Nations and
kind of this notion of emergent societies, and realized, oh, there's this whole
world out here of people. And the challenge of economics is also political. Like, because economics, you know,
people, different parties running for office, they want their economic friends.
They want their economists to back them up, right, to be their magicians.
Like the magicians in Pharaoh's court, right, the people that are going to say, hey,
you should be listening to me because I've got the expert who says this. And so it gets really muddled, right?
But I was looking at it from a, as a scientist going, what is this space?
What does this mean?
How does Paris get fed?
How does, what is money?
How does it work?
And I found a lot of writings that I really loved.
I found some things that I really loved.
And I learned from that.
It was writings from people like von Mises.
He wrote a paper in 1920 that still should be read more than it is.
It was the economic calculation problem of the socialist Commonwealth.
It was basically in response to the Bolshevik Revolution in 1917.
And his basic argument was, it's not going to work to not have private property.
You're not going to be able to come up with prices.
The bureaucrats aren't going to be able to determine how to allocate resources without a
price system. And a price system emerges from people making trades.
And they can only make trades if they have authority
over the thing they're trading.
And that creates information flow
that you just don't have if you try to top down it.
Right.
And it's like, huh, that's a really good point.
Yeah, the prices have a signal that's used.
And it's important to have that signal
when you're trying to build a
community of productive people like you would in the software engineering. Yeah, the prices are
actually an important signaling mechanism. Yeah, right. And that money is just a bartering tool.
Yeah. Right. So this is the first time I'd encountered any of these concepts, right? And the
fact that, oh, this is
actually really critical. Like it's so critical to our prosperity
and that we're dangerously not learning about this,
not teaching our children about this.
You know.
So you had the three kids,
you had to make some money, right?
I had to figure it out.
But I didn't really care.
I mean, I've never been driven
by money. I just needed to eat.
So how did that resolve itself in terms of SciPy?
So I would say it didn't really resolve itself.
It sort of started a journey that I'm continuing on.
I'm still on, I would say.
I don't think it resolved itself, but I will say I went in
with eyes wide open.
Like I knew that there were problems with, you know,
giving stuff away and creating the
market externalities, the fact that, yeah, people might use it.
And I might not get paid for it
and I'll have to figure something else out to get paid. Like at least I can say I'm not bitter
that a lot of people have used stuff that I've written and I haven't necessarily benefited
economically from it. Like I've heard other people be bitter about that when they write or
they talk like, oh, I should have got more value out of this. And I'm also, I want to create systems
that let people like me who might have these desires to do things,
let them benefit so that actually creates more of the same.
Not to turn on your bitterness module,
but there's some aspect, I wish there were mechanisms for me
to reward whoever created SciPy and NumPy,
because it brought so much joy to my life.
I appreciate that.
And I mean, the tip jar notion was there.
I appreciate that.
But there should be a very...
There should be.
There should be.
A frictionless mechanism.
I totally agree.
I would love to talk about some of the ideas I have because I actually came across, I
think I've come up with some interesting notions that could work, but they'll require anything
that will work takes time to emerge.
Right.
Things don't just turn over.
There's definitely one thing I've also understood and learned.
Is any fixes?
That's why it's kind of funny. We often give credit to, you know, oh, this president gets elected and look how great
things have gone. And I saw that when I had a transition at Anaconda, and you see, okay, man,
right? And it's like, the success that's happening, there's an inertia there.
Yeah, right. And sometimes the decision made like 10 years before is the reason for the success
you see, right? Exactly. So we're sort of just running around taking credit for stuff. So I feel like I'm with you. Like I want the same thing. I want to be able to, and honestly not for me personally, I've been happy. I feel like I don't
have any. I mean, we've done reasonably okay, but I've had to pursue it. Like that's,
that's really what started my trajectory from academia. Reading that stuff led me to say,
oh, entrepreneurship matters. I love software, but we need more entrepreneurs, and I want to understand that better.
Once I had that virus infect my brain, even though I was on a trajectory to go to a
tenure track position at university, and I was there for six years, I was kind of already
out the door when I started.
We can get into that.
Can I just ask a quick question on that? Were there some design principles that were in your mind around SciPy, some key ideas that were just sticking with you,
the fundamental ideas? Yeah, I would say so. I would think it's basically accessibility to scientists. Like, give them,
Give scientists and engineers tools that they don't have to think a lot about programming
So give them really good building blocks give them functions that they want to call, and
just the right length of spelling.
There's one tradition in programming where it's like, make very, very long names.
And you can see it in some programming languages where the names, you know, take half
the screen.
And in the Fortran world, names had to be six letters early on, right? And that's way too little. But I
liked to have names that were informative but short. So even though Python was a different conversation. But
documentation is doing some work there. So when you look at great scientific libraries and functions, there's a richness
of documentation that helps you get into the details. The first glance at a function gives you
the intuition of all it needs to do by looking at the headers and so on, but to get the depths of
all the complexities involved, all the options involved, documentation does some work.
Documentation is essential. Yeah, so we thought about several things.
One is we wanted plotting.
We wanted interactive environment.
We wanted good documentation.
These were things we knew.
We wanted.
The reality is those took about 10 years to evolve.
Right.
Given the fact that we didn't have a big budget, it was all volunteer labor.
It was sort of, when Enthought got created and they started to, you know, try to find
projects, people would pay for
pieces, and they were able to fund some of it.
Not nearly enough to keep up with what was necessary.
No criticism, just simply the reality.
It's hard to start a business and then do consulting and also promote an open source project
that's still fairly new.
SciPy was fairly niche.
We stayed connected all while I was a student, sorry, a professor.
I went to BYU and started to teach
electrical engineering, all the applied math courses.
I loved teaching signal processing, probability theory, electromagnetism.
If you look at Rate My Professors, which my kids love to do, I got
some bad reviews, because the criticism was
I would speak at too high a level.
Like I definitely had a calibration problem
coming out of graduate work
where I hate to be condescending to people.
Like I really have a ton of respect for people,
fundamentally, like my fundamental thing is
I respect people.
Sometimes that can lead to a,
I was thinking they were,
they had more knowledge than they did.
And so I would just speak to the very high level,
assume they got it.
But they need to rise to the standard that you set.
I mean, some of the greatest teachers do that.
And I agree, and that was kind of what was inspiring me.
But I cannot say I was as articulate
as some of the greatest teachers.
Right, you know, like one classic example:
when I first taught at BYU,
my very first class, it was overheads, transparencies on overheads, before projectors were really that
common. So, transparencies. I'm writing my notes out, I go in, room's half dark, and I'm just
blazing through these transparencies. Here it is, here it is, here it is. And I had to
give a quiz after two weeks, and no one knew anything. Nothing I taught had gotten anywhere.
And I realized, okay, this is not working.
So I took, put away the transparencies
and I turned around and just started using the chalkboard.
And what it did is it slowed me down.
Right, the chalkboard just slowed me down
and gave people time to process and to think
and then that made me focus.
My writing wasn't great on the chalkboard,
but I really loved that part of the teaching.
So that entered SciPy's world in terms of,
we always understood that
there's a didactic aspect of SciPy.
Kind of how do you take the knowledge and then produce it?
The challenge we had was the scope.
Like, ultimately, SciPy was everything, right?
And so, 2001, when it first came out,
people were starting to use it.
No, this is cool, this is the tool we actually use.
At the same time, in the 2001 timeframe, there was a little bit of, like, the Hubble Space
Telescope: the folks at Hubble started to say, hey, Python, we're going to use Python
for processing images from Hubble. And so Perry Greenfield was a good friend and was running
that program. And he had called me before I left for BYU and said, you know, we want
to do this. But Numeric actually has some challenges in terms of, you know,
the array doesn't have enough types.
We need more operations, you know, broadcasting needs to be a little more settled.
They wanted record arrays.
You know, record arrays are like a data frame, but a little bit different.
But they wanted more structured data.
So he had called me even early on then and he said, you know, would you want to work
on something to make this work?
And I said, yeah, I'm interested, but I'm going here.
And we'll see if I have time.
So in the meantime, while I was teaching, and SciPy was emerging, and I had a student, I
was constantly, while I was teaching, trying to figure out a way to fund this stuff.
So I had a graduate student, my only graduate student, Chinese fellow, Liu Hong-Zen is his
name, great guy.
He wrote a bunch of stuff for iterative linear algebra, like
got into writing some of the iterative linear algebra tools that are currently there in
SciPy, and they've gotten better since. So up to 2005 I kept working on SciPy, but Perry
had started working on a replacement for Numeric called numarray. And in 2004, a package
called nd_image came out. It was an image processing library that was written for
numarray. And it had in it a morphology tool. I don't know if you know what morphology is: it's opening,
dilation, that sort of thing. As a medical imaging student, I knew what it was because it was
used in segmentation a lot. And in fact, I wanted to do something like that in Python and SciPy,
but I had just never gotten around to it.
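For readers who have not met morphology, here is a minimal sketch of binary dilation and opening, written with plain NumPy. It is only an illustration of the idea, not the nd_image code; the helper names `dilate`, `erode`, and `opening` and the cross-shaped neighborhood are assumptions chosen for the example.

```python
import numpy as np

def dilate(img):
    """Binary dilation with a 3x3 cross: a pixel turns on if it or any 4-neighbor is on."""
    p = np.pad(img, 1, constant_values=False)  # pad with "off" so edges have neighbors
    return (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
            | p[1:-1, :-2] | p[1:-1, 2:])

def erode(img):
    """Binary erosion with the same cross: a pixel stays on only if all 4-neighbors are on too."""
    p = np.pad(img, 1, constant_values=False)
    return (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
            & p[1:-1, :-2] & p[1:-1, 2:])

def opening(img):
    """Opening is erosion followed by dilation; it removes small specks."""
    return dilate(erode(img))

img = np.zeros((7, 7), dtype=bool)
img[2:5, 2:5] = True   # a 3x3 blob, the "object" to segment
img[0, 6] = True       # an isolated speck of noise

opened = opening(img)
print(opened.astype(int))  # the speck is gone; the core of the blob survives
```

In practice one would likely reach for `scipy.ndimage`, which offers functions such as `binary_dilation` and `binary_opening` with configurable structuring elements.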
So when it came out, it worked only on numarray, and SciPy needed Numeric, and
so we effectively had the beginning of this split.
And Numeric and numarray didn't share data.
They were just two.
So you could have a gigabyte of numarray data and a gigabyte of Numeric data, and they
wouldn't share it.
And so then you have these scientific libraries written on top.
I got really bugged by that.
I got really like, oh man, this is not good.
We're not cooperating now.
We're now sort of redoing each other's work
and we're just a young community.
So that's what led me, even though I knew it was risky,
because I was in a tenure-track position. In 2004
I got reviewed.
They said, hey, things are going okay.
You're doing well, papers coming out,
but you're kind of spending a lot of time
on this open source stuff.
Maybe do a little less of that
and a little more of the paper writing and grant writing,
which was naive, but it was definitely the thinking at the time.
The thinking that still goes on, still goes on.
You're basically creating a thing
which enables science in the 21st century.
Right.
Maybe don't emphasize that so much
in your pre-tenure years.
Right.
It illustrates some of the challenges.
Yes.
It does.
And it's people mean well.
Yeah.
Like, but we've gotten broken in a bunch of ways.
And certain things,
a programming, understanding the role
of software engineering programming.
Exactly.
Society is a little bit like.
Exactly.
Now, I was in an electrical engineering position.
Right. Which is even worse. Yeah, it was
very, they were very focused. And so, you know, good people. And I had a great
time. I loved my time. I loved my teaching. I loved all the things I did
there. The problem was this split was happening. This community I loved. I
saw people and I go, oh my gosh, this is going to be, this is not great. And so I
happened to, you know, fate: I had a class I had signed up to teach. I was trying to build an MRI system.
So I had a kind of, instead of a digital radio class, a digital MRI class.
And I had people sign up, two people signed up, then they dropped. And so I had nobody in
this class. And I didn't have any other courses to teach. And I thought, oh, I've got
some time.
And I'll just write a merger of Numeric and numarray.
Like, I'll basically take the Numeric code base,
add the features numarray was adding,
and then kind of come up with a single array
that everybody can use.
So that's where NumPy came from: it was me thinking,
hey, I can do this.
And who else is going to?
Because at that point, I'd been around the community
long enough and I'd written enough C code.
I knew the structures.
And in fact, my first contribution to Numeric
had been writing the C API documentation
that went in the first documentation for NumPy, sorry, for Numeric.
This is Paul Dubois, David Ascher, Konrad Hinsen
and myself.
I got credit because I wrote this chapter, which
is all the C API of Numeric, all
the C stuff.
So I said, I'm probably the one to do it.
Nobody else is going to do this.
So it's sort of out of a sense of duty and passion, knowing that, I don't think
the department here is going to appreciate this, but it's the right thing
to do.
So can we just linger on that moment?
Yeah. The importance of the way you thought and the action you took, I feel, is understated and
is rare, and I would love to see so much more of it, because what happens as the tools become
more popular is there's a split that happens.
And it's a truly heroic and impactful action, in that early
split, to step up. And it's like great leaders throughout history, like, what is it, Braveheart,
get on the horse and rally the troops, because I think that can make a big difference.
We have TensorFlow versus PyTorch in machine learning. We have the same problem today.
Yeah, it's a wonder.
It's actually bigger.
I wonder if it's possible in the early days to rally the troops.
It is possible, especially in the early days, the longer it goes, the harder, right?
The more energy in the factions, the harder.
But in the early days, it is possible, and it's extremely helpful.
And there's a willingness there, but the challenge is,
there's just not a willingness to fund it.
There's not a willingness to, you know,
like I was literally walking into a field,
say I'm gonna do this,
and you know, here I am,
you know, I have five kids at home now.
Yeah.
Pressure, bills.
Sometimes my wife hears these stories,
and she's like, you did what?
I thought you were actually
on a path to make sure we had resources and money. But again, there's an aspect, I'm a
very hopeful person, I'm an optimistic person by nature, I love people. I learned that about
myself later on. Part of my religious beliefs actually lead to that. And that's why I hold
them dear because it's actually how I feel about.
That's what leads me to these attitudes,
sort of this hopefulness and this sense of,
yeah, it may not work out for me financially or maybe
but that's not the ultimate gain.
Like, that's a thing, but that's not the scorecard for me.
And so I just wanted to be helpful and I knew,
partly because of these SciPy conferences,
because of the mailing list conversations,
I knew there was a lot of need for this, right? And so it wasn't like I was alone in terms of no feedback.
I had these people who knew. But it was crazy. Like, people have since said, yeah, we didn't think you'd be able to do it.
Yeah, we thought it was crazy and also
instructive like practically speaking
That you had a cool feature that you were chasing in the morphology
like the, yes, it's not just end result.
It's not some visionary thing.
I'm going to unite the community.
You were like, you were actually practically, this is what one person actually could do and
actually build them.
Because that is important, because you can get over your skis.
You can definitely get over your skis.
And I had, in fact, this almost got me over my skis, right?
I would say, well, in retrospect, I hate looking back.
I can tell you all the flaws with NumPy, right?
We want to go into it.
There's lots of stuff that I'm like,
oh man, that's embarrassing.
That was wrong.
I wish I had somebody stop me with a wet fish there.
Like, what I wished I'd had was somebody
with more experience, certainly in library writing and array libraries.
I wish I'd had me, so I could go back in time and go, do this,
do that.
Because there's things we did that are still there
that are problematic, that created challenges for later.
And I didn't know it at the time,
didn't understand how important that was.
And in many cases I didn't know what to do.
Like, there were pieces of the design of NumPy
I didn't know what to do with until five years ago. Now I know what they should have been, but I
didn't know at the time, and I couldn't get the help. Anyway, so I wrote it. It
took about four months to write the first version, then about 14 months to make
it usable. But it was that first four months of intense writing, coding,
getting something out the door that worked,
that was definitely challenging.
And then the big idea was to create a new type object
called dtype.
That was probably the single biggest contribution.
And then the fact that I added not just broadcasting,
but advanced indexing, so that you could do
mask indexing and indirect indexing instead of just slicing.
For people who don't know, maybe you can elaborate.
Yeah.
NumPy, I guess the vision in the narrowest sense is to have this object that represents
N-dimensional arrays.
And like at any level of abstraction you want, but basically it could be a black box
that you can investigate in ways that you would naturally want to investigate.
Yes, such objects.
Yes, exactly. So you could do math on it easily.
Math on it easily.
So it had an associated library of math operations.
And effectively, SciPy became an even larger set of math operations.
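The ideas mentioned above (an N-dimensional array with a dtype, broadcasting, and mask or indirect indexing alongside plain slicing) can be sketched in a few lines with today's NumPy. The array contents and variable names here are made up for illustration.

```python
import numpy as np

# An N-dimensional array; the dtype object describes what each element is.
a = np.arange(12, dtype=np.int16).reshape(3, 4)
print(a.dtype, a.shape)

# Basic slicing returns a view onto the same memory.
sliced = a[:, 1:3]

# Mask indexing: a boolean array selects the elements where it is True.
mask = a % 2 == 0
evens = a[mask]

# Indirect (index-array) indexing: pick elements (2, 3) and (0, 1).
picked = a[[2, 0], [3, 1]]

# Broadcasting: a length-4 vector is stretched across all three rows.
scaled = a * np.array([1, 10, 100, 1000])

# Math on the whole array at once, using the associated library of operations.
total = np.sqrt(a).sum()
print(evens, picked, scaled[2], total)
```

Slicing shares data with the original array, while mask and index-array selections produce new arrays, which is one reason the distinction matters in practice.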
So the key for me was I was going to write NumPy and then move SciPy to depend on NumPy.
In fact, early on, one of the initial proposals was that we would just write SciPy and it
would have the Numeric object inside it, and it would be scipy.array or something.
That turned out to be problematic because Numeric already had a little mini library of linear
algebra and some functions, and it had enough momentum, enough users, that
they wanted the backward compatibility.
One of the big challenges of NumPy
was I had to be backward compatible
with both Numeric and numarray in order to allow
both of those communities to come together.
There was a ton of work in creating
that backward compatibility that also created echoes
in today's object. Like, some of the complexity
in today's object is actually from that goal
of backward compatibility with these other communities, which if you didn't have that, you'd do something different,
which is instructive because for a lot of things that are there, you know, you think, what is that
there for?
It's like, well, it's an artifact of its historical existence.
By the way, I love the empathy and the lack of ego behind that, because I feel you see that in the split in the JavaScript frameworks, for example, the arbitrary branching.
I think in order to unite people, you have to kind of put your ego aside and truly listen to others, like, what do you love about numarray? What do you love about Numeric?
Like, actually get a sense. We were talking about languages earlier, sort of empathize with the culture of the people that love something about this particular
API, the naming style or the actual
usage patterns, and truly understand them, so that you can create that same
draw in the united thing.
Completely agree. And you have to also have enough passion that you'll do it.
It can't be just like a perfunctory, oh yeah, I'll listen to you
and then I'm not really excited about it.
So it really is an aspect.
It's a philosophical, like there's a philia.
There's a love of esteeming of others.
It's actually at the heart of what,
it's one of a life philosophy for me, right?
That I'm constantly pursuing, and
that helped, absolutely helped.
Makes me wonder in a philosophical, like looking at human civilization as one object, it
makes me wonder how we can copy and paste travisism as well.
Well, in the, in the, some aspects, maybe.
Some aspects, right, right.
Exactly.
Well, I, it's a good question.
How do we teach this?
How do we, how do we encourage it?
How do we lift it? Because so much of the software world, it's giant communities,
right? But it seems like so much is moved by, like, little individuals. You talk
about, like, Linus Torvalds. It's like, could you have had Linux
without him? It's like Guido and Python. I mean,
the SciPy community, particularly, it's like I said, we wanted to build this big
thing, but ultimately we didn't.
What happened is we had mavericks and champions like John Hunter, who created Matplotlib.
We had Fernando Perez, who created IPython.
And so we sort of inspired each other, and then it created sort of a
culture of this selfless stewardship mentality as opposed to ownership mentality: stewardship and community-focused but intentional work.
Like not waiting for everybody else to do the work, but you're doing it for the benefit
of others and not worried about what you're going to get.
You know, you're not worried about the credit, you're not worried about what you're going
to get, you're worried about.
I later realized that I have to worry a little about credit, not because I want the
credit, but because I want people to understand what led
to the results. It's not about me. I want people to understand, this is what led to the result,
and this is what had no impact on the result. It's
just like you said, I want to promote the attributes that help make us better off.
How do we make more of Wes McKinney? Like, Wes McKinney was critical to the success of Python
because of his creation of pandas,
the roots of which were all the way back
in Numeric, numarray, and NumPy,
where NumPy created an array of records.
Wes started to use that almost like a data frame,
except it's an array of records.
And with an array of records, the challenge is,
okay, if you want to augment it,
add another column,
you have to insert, you have to do all those memory
movements when you insert a column.
Whereas data frames became, oh, I'm going to have
a loose collection of arrays.
So it's a record of arrays that is the heart of a data
frame. And we thought about that back in the numarray days,
but Wes ended up doing the work to build it,
and then also the operations that were relevant for data processing.
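The contrast between an array of records and a record of arrays can be sketched as follows. The plain dict of arrays is only a toy stand-in for the data frame idea, not how pandas is implemented, and the field names are invented for the example.

```python
import numpy as np

# Array of records: one contiguous block, with the fields interleaved row by row.
rec = np.array([("a", 1.0), ("b", 2.0)], dtype=[("name", "U1"), ("x", "f8")])

# Adding a column means building a wider dtype and copying every field across,
# because each record's bytes have to make room for the new field.
wider = np.empty(rec.shape, dtype=rec.dtype.descr + [("y", "f8")])
for field in rec.dtype.names:
    wider[field] = rec[field]
wider["y"] = rec["x"] * 2

# Record of arrays: a loose collection of independent columns.
frame = {"name": np.array(["a", "b"]), "x": np.array([1.0, 2.0])}

# Adding a column is just attaching one more array; existing columns are untouched.
frame["y"] = frame["x"] * 2

print(wider.dtype.names, list(frame))
```

The second layout trades the single contiguous block for cheap column insertion, which is the property the data frame design leans on.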
What I noticed is just that each of these little things creates just another tick, another up.
So NumPy ultimately took a little while. About six months in, people started joining me.
You know, Francesc Alted, Robert Kern, Charles Harris.
And these people are many of the unsung heroes, I would say. People who,
you know, sometimes don't get the credit they deserve, because they were critical
both to support, like, you know, it's hard and you need some support, people need support.
And I needed just encouragement, and they were helping to encourage by contributing. And
the big thing for me was when John Hunter, he had previously done kind of a simple thing
called numerix to kind of move between Numeric and numarray, a little high-level
tool that would just select each one from Matplotlib.
In 2006, he finally said, we're going to just make NumPy the dependency of Matplotlib.
As soon as he did that, and I remember specifically when he did that, I said, okay, we've done
it.
That was when I knew we had succeeded.
And before then, it was still, you know,
unsure, kind of a roller coaster,
and then 2006 to 2009.
And then I've been floored by what it's done.
Like, I knew it would help.
I had no idea how much it would help, right?
So.
And it has to do with, again, the language thing.
It just, people started to think in terms of NumPy, like, yes.
And that opened up a whole new way of thinking. And part of the story that you kind of mentioned, but
maybe you can
elaborate, is it seems like at some point in this story
Python took over science and data science
and, bigger than that, the scientific community
started to think like programmers
or started to utilize the tools of computers to do,
like, at a scale that wasn't done with Fortran,
like at this gigantic scale,
they started opening their hearts, and then Python was the thing.
I mean, there's a few other competitors, I guess, but Python, I think, really, really took over.
I agree. There's a lot of stories here that are kind of during this journey because this is sort
of the start of this journey in 2005, 2006. So my tenure committee, I applied for tenure in 2006,
2007. It came back, I split the department. I was
very polarizing. I had some huge fans and then some people said no way, right? So
I was a polarizing figure in the department. It went all the way up to the university president.
Ultimately, my department chair had the sway, and they didn't say no. They said,
come back in two years and do it again. And at that point, I was like, I had this interest
in entrepreneurship, this interest in, not the academic circles, but, like, how
do we make industry work? So I do have to give credit to that exploration of economics,
because that led me, oh, I had a lot of opinions. I was actually very libertarian
at the time. And I still have some libertarian tendencies, but I'm more of a collectivist
libertarian. I see, you value, broadly, philosophically, freedom.
I value, broadly, philosophically, freedom, but I also understand the power of communities,
like the power of collective behavior. And so what's that balance, right? That makes
sense. But at the time I was just, I got to go out and explore this entrepreneur world.
So I left academia.
I said, no thanks.
Called my friend Eric here, whose company was going.
I said, hey, could I join you and start this trend?
And at that time, they were using SciPy a lot.
They were trying to get clients.
So I came down to Texas.
And Texas is where I sort of, it's my entrepreneur world, right? I left academia and went to the entrepreneur
world in 2007. So I moved here in 2007, kind of took a leap. Knew nothing really about
business. Knew nothing about a lot of stuff there. You know, for a long
time, I've kept some connections to a lot of academics because I still value it. I still love the scientific tradition. I
still value the essence and the soul and the heart of what is possible. Don't
like a lot of the administration and the kind of we can go into detail about
why and where and how this happens. What are the challenges? I mean, I don't
know, but I'm with you. So I'm still with the MIT.
I still love MIT because there's magic there. Yeah, there's people I talk to, like researchers,
faculty, in those conversations and the white board and just the conversation. That's magic there.
All the other stuff, the administration, all that kind of stuff, seems to, you don't
want to say too harshly criticized sort of bureaucracies, but there's a lag that seems
to get in the way of the magic. And I don't, I'm still have a lot of hope that that can
change because I don't often see that particular type of magic elsewhere in the
industry.
We need that, and we need that flame going.
It's the same thing as exact as you said.
It has the same kind of elements like the open source community does.
The reason I stepped away, the reason I'm here, just like you did in Austin, is like,
if I want to build one robot, I'll stay at MIT.
But if I want to build millions and make money enough to work and explore the magic of
that, then you can't.
And I think that dance is-
That translational dance has been lost a bit.
Right?
And there's a lot of reasons for that.
I'm certainly not an expert on this stuff. I can opine like anybody else, but I
realized that I wanted to explore entrepreneurship and really figure out, and it's been a driving passion for 20,
25 years, how do we connect
capital markets and company?
Because again, I fell in love with the notion, oh, profit seeking on its own is not a bad thing.
It's actually a coordination mechanism for allocating resources, you know, in
an emergent way, right?
That respects everybody's opinions, right?
So this is actually powerful.
So, so I say all the time when I make a company and we do something that makes profit,
what we're saying is, hey, we're collecting some of the world's resources and voluntarily,
people are asking us to do something they like.
And that's a huge deal.
And so I really liked that energy.
So that's why I came to do and to learn
and to try to figure out.
And that's what I've been kind of stumbling through
since for the past 14 years.
2007.
2007.
So you were still.
So no problem.
Just emerging.
Just emerging.
One thing I'd done,
it's worth mentioning because it emphasizes the exploratory nature
of my thinking at the time.
I said, well, I don't have a way to fund this thing.
I've got a graduate student I'm paying for
and I've got no funding for him.
And I had done some fundraising
from the public to try to get public funding for my lab.
I didn't really want to go out
and just do the fundraising circuit
the way it's traditionally done.
So I wrote a book. I said, I'm gonna write a book
and I'm gonna charge for it. It was called Guide to NumPy. And so ultimately NumPy became
documentation-driven development, because I basically wrote the book and made sure the
stuff in the book would work. So it really helped actually make NumPy become a thing.
So writing that book, and it was not a page-turner. I mean, Guide to NumPy
is not a book you pick up and go, oh, this is great, by the fire.
But it was, it's where you could find the details,
like how do all this work?
And a lot of people love that book.
And so,
when I said, look, I need to,
so I'm gonna charge for it,
I got some flak for that, not that much.
Just probably five angry messages,
people, you know, yelling at me,
saying I was, you know, a bad guy for charging for this book.
It's one of them which is Tom.
No, I haven't really had any interaction with him personally, like I said.
But there were a few, but actually surprisingly not.
There were actually a lot of people like, no, it's fine.
You can charge for a book.
That's no big deal.
We know that's the way you can try to make money around open source.
But I did it in an interesting way.
I said, well, my ideas around IP law and stuff,
I love the idea
that you can share something and you can spread it.
The fact is that you have a thing and copying is free,
but the creation is not free.
So how do you fund the creation and allow the copying?
And then software is a little more complicated than that,
because creation is actually a continuous thing.
It's not like you build a widget and it's done.
It's sort of a process of emerging and continuing to create.
But I wrote the book and had this market-determined price thing.
I said, look, I need, I think I said 250,000.
If I make 250,000 from this book, I'll make it free.
So as soon as I get that much money, or I said five years, right?
So there's a time limit.
That's cool. I didn't know the story.
Yeah, so I released it on this.
And it's actually interesting because one of the people who also thought that was interesting
ended up being Chris White, who was the director of the DARPA project
we got funding through at Anaconda.
And the reason he even called us back is because he remembered my name from this book.
And he thought that was interesting.
Yeah.
And so even though we hadn't gone to the demo days,
we applied and the people said,
yeah, nobody ever gets this without coming to the demo day first.
This is the first time I've seen it.
But it's because I knew Chris had done this,
had this interaction.
So it did have impact.
I was actually really pleased by the result.
I ended up, in three years, I made 90,000.
So sold 30,000 copies.
By myself, I just put it up on,
you know, used PayPal and sold it.
And that was my first taste of kind of,
okay, this can work to some degree.
And, you know, all over the world, right,
from Germany to Japan, it actually did work.
And so I appreciated the fact that PayPal existed
and I had a way to get the money,
and distribution was simple.
This is pre-Amazon book stuff,
so it was just publishing a website.
It was the popularity of SciPy emerging
and getting company usage.
I ended up not letting it go the five years
and not trying to make the full amount
because a year and a half later I was at Enthought.
I had left academia and was at Enthought
and I kind of had a full-time job.
And then actually what happened
is the documentation people, there's a group that said,
hey, we want to do a documentation for SciPy as a collective.
They were essentially needing the stuff in the book.
And so they kind of asked, hey, could we just use the stuff for your book?
And at that point, I said, yeah, I'll just open it up.
But it has served its purpose.
And the money that I made actually funded my grad student,
like it was actually, you know,
I paid him 25,000 a year out of that money.
So the point is, if you did a
very similar kind of experiment now with NumPy
or something like it, you could probably make a lot more.
That's probably true.
Because of the tooling and the community building.
Yeah, I agree.
Like, and social media,
there's just a virality to that kind of idea.
I agree, there'd be things to do.
I've thought about that.
And really, I've thought about a couple of books
or a couple of things that could be done there.
And I just haven't, right?
I even tried to hire a ghostwriter this year too,
to see if that could help, but it didn't.
Like, part of my problem is
I've been so excited by a number of things
that stemmed from that.
Like, so I came here, worked at Enthought for four years.
Graciously, Eric made me president, and we started to work closely together.
We actually helped him buy out his partner.
It didn't end great.
Like, unfortunately, Eric and I aren't real friends now.
I still respect him.
I have a lot, I wish we were, but he didn't like the fact that Peter and I
started Anaconda, right? That was not, I mean, there's two sides of that story,
so I'm not going to go into it, right? Sure. But you're human beings, and you wish you still
could be friends. I do. I do. It saddens me. I mean, that's a story of great minds building great companies.
Yeah, somehow it's sad that when there's that kind of...
And I hold him in esteem, I'm grateful for him.
I think he's there doing...
And Enthought still exists, they're doing great work
helping scientists. They still run the SciPy conference.
They have an R&D platform
they're selling now, that's a tool that you can go get today, right?
So
Enthought has played a role
in supporting the community around SciPy, I would say.
They ended up not being able to,
they ended up building a tool suite to write GUI applications.
Like, that's where
the business could work.
And so supporting SciPy and NumPy itself
wasn't as possible.
Like, they tried. I mean, it was just because of the business aspect.
So then I wanted to build a company that could get venture funding.
Right. For better or worse. I mean, that's a longer story. We could talk a lot about that.
And that's where Anaconda came to be.
So let me let me ask you it's a little bit for fun because you built this amazing thing.
So let's talk about an old warrior looking over old battles.
There's a sad letter in 2012 that you wrote to the NumPy mailing list, announcing that
you're leaving NumPy, and some of the things you've listed, and some of the things you regret,
or not regret necessarily,
but some things to think about.
If you could go back and you could fix stuff about NumPy,
or both sort of in a personal level,
but also like looking forward,
what kind of things would you like to see changed?
Good questions.
So I think there's technical questions
and social questions right there.
First of all, you know, I wrote NumPy as a service and I spent a lot of time doing it.
And then other people came and helped make it happen.
NumPy succeeded because of the work of a lot of people, right.
So it's important to be able to understand that.
I'm grateful for the opportunity,
the role I could play, and grateful that things I did had an impact,
but they only had the impact they had because of the other people that came to the story,
and so they were essential.
But the way data types were handled,
we had array scalars, for example,
that are really just a substitute for a type concept.
Right, so array scalars are actual Python objects,
so there's one for a 32-bit float
or a 16-bit float or a 16-bit integer.
Python doesn't have a natural equivalent, there's just the one integer, the one float. Well, what about these
lower-precision types and these larger-precision types? So we had them in NumPy
so that you could have a collection of them, but then have an object in Python that was one of them.
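A short sketch of what an array scalar looks like in practice, using current NumPy; the variable names are illustrative only.

```python
import numpy as np

arr = np.array([1, 2, 3], dtype=np.int16)

# Pulling one element out of the array gives an array scalar,
# a Python object that remembers it is a 16-bit integer.
x = arr[0]
print(type(x))                    # a NumPy scalar type, not the built-in int
print(isinstance(x, np.generic))  # np.generic is the common base of array scalars
print(isinstance(x, int))         # the 16-bit scalar is not a built-in int

# The scalar carries the same dtype as the array it came from.
print(x.dtype == arr.dtype)

# Lower-precision floats exist as standalone objects too.
y = np.float32(0.1)
print(float(y) == 0.1)            # float32 cannot hold 0.1 as precisely as a Python float
```

So the collection (the array) and the single value (the array scalar) each have a Python-level representation, which is the gap the built-in `int` and `float` leave open.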
Mm-hmm. And there's questions about that. In retrospect,
I wouldn't have created those; I would have
improved the type system,
made the type system actually a Python type system,
as opposed to currently, where it's a Python 1 level type system.
I don't know if you know the difference
between Python 1 and Python 2.
It's kind of technical, kind of in-depth.
But Python 2, one of its big things that Gido did,
it was really brilliant, it was actually Python 2, one of its big things that Gito did, it was really brilliant. It was actually Python 1, all classes, new objects, were one.
So he was a user, wrote a class.
It was an instance of a single Python type called the class type.
In Python 2, he used a meta typing hook to actually go, oh, we can extend this and
have users write classes that are new types.
So it's able to have your user classes be actual types.
And the Python type system got a lot more rich.
I barely understood that
at the time that NumPy was written.
And so I essentially, in NumPy,
created a type system that was Python 1 era.
Every dtype is an instance of the same type,
as opposed to having new dtypes be really just Python types
with additional metadata.
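The distinction can be sketched in a few lines. `Quaternion` is a hypothetical user class used only for illustration. Recent NumPy releases have reworked parts of the dtype internals, so this shows only the general idea that a dtype is an object (an instance of `np.dtype`) rather than a Python class.

```python
import numpy as np

# Since the type/class unification, a user-defined class is itself a type.
class Quaternion:  # hypothetical example class
    def __init__(self, w, x, y, z):
        self.w, self.x, self.y, self.z = w, x, y, z

q = Quaternion(1, 0, 0, 0)
print(isinstance(Quaternion, type))  # True: the class is a real Python type
print(type(q) is Quaternion)         # True

# NumPy dtypes are objects: each one is an instance of np.dtype.
f32 = np.dtype("float32")
i16 = np.dtype("int16")
print(isinstance(f32, np.dtype), isinstance(i16, np.dtype))  # True True
print(isinstance(f32, type))         # False: a dtype is an instance, not a class
```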
What's the cost of that?
Is it efficiency?
Is it usability?
It's usability, primarily.
The cost isn't really efficiency.
It's the fact that it's clumsy to create new types.
It's hard.
And then one of the challenges is you
want to create new types.
You want a quaternion type, or you want
to add a new, you know,
positive type, or you want to,
so it's hard.
Now, if we had done that well,
when Numba came on the scene,
where we could actually compile Python code,
it would integrate with that type system much cleaner.
And now all of a sudden,
you could do gradual typing more easily.
You could actually have Python, when you add Numba plus better typing,
could actually be a, you'd smooth out a lot of rough edges.
But are you talking about from the
perspective of developers within NumPy or users of NumPy?
Developers of, not really users of NumPy so much.
It's the development of NumPy.
So you're thinking about, like,
how to design NumPy so that its contributors...
Yeah, the contributors, it's easier.
It's less work to make it better and to keep it maintained.
And where that's impacted things, for example,
is the GPU. Like, all of a sudden GPUs started getting added,
and we don't have them in NumPy.
Like, NumPy should just work on GPUs.
The fact that we have to download a whole other object
called CuPy to have arrays on GPUs
is just an artifact of history.
Like there's no fundamental reason for it.
Well, that's really interesting.
If we could sort of go on that tangent briefly,
you have PyTorch and other libraries,
like TensorFlow, that basically tried to mimic NumPy. Like, you've created a sort of
platonic form of what I'm talking about. Yeah, exactly. Well, the problem was I didn't realize that.
Yeah. The platonic form has a lot of edges there, right? We should cut those out before we present it.
So I wonder if you can comment, is there like a difference between their implementations? Do you wish that they were all using NumPy, or, like, an abstraction over the GPU?
And sorry to interrupt.
There's GPUs, ASICs, there might be other, neuromorphic computing.
There might be other kinds, or the aliens will come with a new kind of computer.
Like an abstraction that NumPy should just operate nicely over the things that are more
and more and smarter and smarter
with these multi-dimensional arrays. Yeah, yeah. I have several comments there. We are working on
something now called data-apis.org. You can go there today. And it's our answer. It's
my answer, you know, it's not just me, it's me and Ralf and Athan and Aaron, and a lot of companies are helping
us at Quansight Labs.
It's not unifying all the arrays, it's creating an API that is unified.
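To make "an API that is unified" concrete, here is a minimal sketch of library-agnostic array code. The helper `standardize` and the convention of passing the array namespace as `xp` are assumptions made for this example, not something described in the conversation. The idea is that any library exposing NumPy-like `mean` and `std` functions could be passed in place of NumPy.

```python
import numpy as np

def standardize(x, xp):
    """Shift to zero mean and scale to unit standard deviation.

    `xp` is any module exposing NumPy-like `mean` and `std` functions
    (hypothetical convention for this sketch).
    """
    return (x - xp.mean(x)) / xp.std(x)

data = np.array([1.0, 2.0, 3.0, 4.0])
z = standardize(data, np)  # NumPy is the namespace here
print(z)
print(float(z.mean()), float(z.std()))  # approximately 0.0 and 1.0
```

The function body never names NumPy directly, which is the property a shared array API is meant to make dependable across libraries.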
So we do care about this and trying to work through it.
Actually, I had the chance to go and meet with the TensorFlow team and the PyTorch team and
talk to them after exiting Anaconda.
Just talking about, because the first year after leaving Anaconda in 2018,
I became deeply aware of this and realized that, oh, this split in the
array community that exists today makes what I was concerned about in 2005
pretty parochial. It's a lot worse.
Now, there's a lot more people, so perhaps the industry can sustain more stacks, right?
There's a lot of money, but it makes it a lot less efficient.
I mean, I've also learned to appreciate, it's okay to have some competition.
It's okay to have different implementations, but it's better if you can at least refactor
some parts.
I mean, you're going to be more efficient if you can refactor parts.
It's nice to have competition over things.
Nice to have competition.
That are innovative.
Yeah, innovative.
And then maybe on the infrastructure, however you define infrastructure.
Right.
Maybe it's nice to have convergence.
Exactly.
I agree.
And I think, but it was interesting to hear the stories.
I mean, TensorFlow came out of a C++ library Jeff Dean wrote, I think, that was basically how
they were doing inference, right?
And then they realized, oh, we could do this TensorFlow thing
with that C++ library. Then what was interesting to me was the fact that both Google and Facebook,
it's not like they supported Python or NumPy initially, they just realized they
had to.
They came to this world and then all the users were like, hey, where's the NumPy interface?
Oh, and they kind of came late to it
and then they had these bolt-ons.
TensorFlow's bolt-on, I don't mean to offend,
but it was so bad.
It was the first time that, I'm usually,
I mean, one of the challenges I have is I don't criticize enough.
In the sense that I don't give people input enough.
I think it's
universally agreed upon, the bolt-ons in TensorFlow.
Right. But I went to a talk given at a PyData in Mallorca in Spain, and a guy, great guy,
he came and gave a talk. I said, you should never show that API again at a PyData conference.
Like, that's terrible. Like, you're taking this beautiful system you've
created and you're corrupting all these poor Python people, forcing them to write code like that, or thinking they should.
Fortunately, you know, they adopted Keras, and Keras is better, and so Keras TensorFlow is fine, is reasonable.
But they bolted it on.
Facebook did too. Like, Facebook had their own
C++ library for doing inference, and they also had the same reaction, they had to do this.
One big difference is Facebook, maybe because of the way it's situated as part of FAIR,
part of their research lab. TensorFlow is definitely used, and they have to make, they
couldn't just open it up and let the community change what that is, because I guess they
were worried about disrupting their operations.
Facebook's been much more open to having community input on the structure itself,
whereas Google and TensorFlow, they're really eager to have community users. People use it and
build the infrastructure, but it's much more walled. It's harder to become a contributor to TensorFlow.
And it's also, this is a very difficult question to answer, and I don't mean to be throwing
shade at anybody, but you have to wonder, it's the Microsoft question
of when you have a tool like PyTorch or TensorFlow, how much are you tending
to the hackers and how much are you tending to the big corporate clients?
Correct.
And so, like, do you tend to the millions of people that are giving you almost no
money, or do you tend to the few that are giving you a ton of money? I tend
to stand with the people.
Right.
Because I feel like if you nurture the hackers, you will make the right decisions in the
long term that will make the companies happy.
I lean that way too.
Totally.
But then you have to find the right balance.
But it's a balance.
Yeah.
Because you can lean to the hackers and run out of money.
Yeah, exactly.
Exactly.
Which has been some of the challenge I faced, in the sense that, like,
I would look at some of the experiences, like NumPy,
the fact that we have this split is a factor of I wasn't able to collect more money towards
NumPy development.
Right.
I mean, I didn't succeed in the early days at getting enough financial contribution
to NumPy so that I could work on it.
I couldn't work on it full time.
I had to just catch an hour here, an hour there.
And I basically didn't like that.
Like, I've wanted to be able to do something about that
for a long time and try to figure out how. Well, there's
lots of ways, I mean, possibly one could say,
you know, we had an offer from Microsoft
in the early days of Anaconda, in 2014,
they offered to come buy us, right?
The problem was the right people
at Microsoft didn't offer to buy us.
And they were still, it was really,
we were like a second, they had really bought,
they'd just bought an R company called,
it was not RStudio,
but it was another R company that was emergent.
And it was kind of a, well, we should also get a Python play, but they were really doubling down on R, right?
And so it was like,
it was where you would go to die.
So it wasn't before Satya was there.
Satya had just started.
Just started.
And the offer was coming from someone
two levels down from him.
Gotcha.
And if it had come from Scott Guthrie,
so I got a chance to meet Scott Guthrie,
great guy, I like him.
If the offer had come from him, I'd probably be at Microsoft right now.
That'd be fascinating. That would be really nice actually, especially given what Microsoft
has since done for the open source community. Yes, I think they're doing well. I really like
some of the stuff they've been doing. They're still working, and they've hired Guido now,
and they've hired a lot of Python developers. Wait, Guido's at Microsoft? I need to...
He retired, then he came out of retirement, and he's working now.
So I was just talking to him and he didn't mention this part.
Well, I should have dug further.
Well, I know he left Dropbox, but I wasn't sure what he was doing, what he was up to.
Well, he was kind of saying he would retire a bit.
And it's literally been five years since I last sat down
and really talked to Guido.
Guido is a technology expert.
So I came in excited because I'd finally
figured out the type system for NumPy.
I wanted to kind of talk about that with him
and I kind of overwhelmed him.
Could you stay in that moment just for a brief moment?
Because he's a fascinating person
in the history of programming.
He is a fascinating person.
What have you learned from Guido about programming, about life?
Yeah, yeah. A lot, actually. I've been a fan of Guido's. We've had a chance to talk
some. I wouldn't say we talk all the time, not at all.
But we talked enough that I respect his, in fact, when I first started NumPy,
one of the first things I did was I asked Guido for a meeting with him and Paul
Dubois in San Mateo. And I went and met him for lunch, basically to say, maybe we can actually,
part of the strategy for NumPy was to get it into Python 3 and maybe be part of Python.
And so we talked about that. That's cool. And about that approach, right?
I would have loved to be a fly on the wall. That was good. And over the years, from Guido I learned,
so he was open, like, he was willing to listen to people's ideas.
Right, and over the years now, generally, you know, I'm not saying universally that's been true,
but generally it's been true. So he's willing to listen. He's willing to defer.
Like on the scientific side, he would just kind of defer. He didn't really always understand what we were doing.
Like, and he'd defer.
One place where he didn't enough was we missed a matrix
multiply operator.
Like, that finally got added to Python,
but about 10 years later than it should have.
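For reference, the matrix multiply operator mentioned here is `@`, available since Python 3.5. A short sketch with made-up matrices shows how it differs from the element-wise `*`:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[0, 1],
              [1, 0]])

# Since Python 3.5 the @ operator dispatches to __matmul__.
print(a @ b)         # [[2 1]
                     #  [4 3]]

# Before that, the same product had to be spelled as a function call.
print(np.dot(a, b))  # same result

# The * operator is element-wise, which is why a separate operator was wanted.
print(a * b)         # [[0 2]
                     #  [3 0]]
```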
But the reason was because nobody,
it takes a lot of effort.
And I learned this while I was writing NumPy,
I also wrote tools to, I became a Python dev, I added some pieces to Python.
Like the memoryview object.
I wanted to add the structure of NumPy into Python.
So we didn't get NumPy into Python, but we got the basic structure of it into Python.
So you could build on it.
Nobody did for a while, but eventually,
the database authors started to.
And it's a lot better that they did.
And also Antoine Pitrou and Stefan Krah actually fixed the memoryview object, because I wrote the underlying infrastructure in C, but the Python exposure
was terrible until they came in and fixed it, partly because I was already writing NumPy, and NumPy
was the Python exposure.
I didn't really care about if you didn't have NumPy installed.
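A brief sketch of what the memoryview object exposes today. The arrays and byte values are invented for the example. It shows that the "basic structure" (shape, strides, item size) is visible from plain Python, and that the same mechanism works on a `bytearray` with no NumPy involved.

```python
import numpy as np

arr = np.arange(6, dtype=np.int32).reshape(2, 3)

# memoryview exposes the array's buffer description without copying the data.
view = memoryview(arr)
print(view.shape)     # (2, 3)
print(view.strides)   # (12, 4)
print(view.itemsize)  # 4

# The buffer protocol also works without NumPy: a bytearray is a buffer too.
raw = bytearray(b"\x01\x02\x03\x04")
mv = memoryview(raw)
mv[0] = 255           # writes through to the underlying bytearray
print(raw[0])         # 255

# NumPy can wrap that same memory without copying it.
wrapped = np.frombuffer(raw, dtype=np.uint8)
print(wrapped)        # [255   2   3   4]
```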
Anyway, Guido: open to ideas, technology brilliant.
I really got a lot of respect for him when I saw what he did with this type-class merger thing.
It was actually tricky.
Then willing to share, willing to share his ideas.
The other thing, early on in 1998, I wrote my first extension module.
The reason I could is because he'd written this blog post on how to do reference counting.
Without it, I would have been lost, right? But he was
willing to at least try to write this post. And so he's been motivated
early on with Python, there's a computer science for everybody, he had this early
on desire to, oh, maybe we should be pushing programming to more people. So he had this
populist notion, I guess, or populist sense. So I learned that there's a certain skill, and
I've seen it in other people too, of engaging
with contributors sufficiently, because when somebody engages with you and wants to contribute
to you, if you ignore them, they go away. So building that early contributor base requires
real engagement with other people. And he would do that.
Can you also comment on his tragic stepping down from his position as the benevolent dictator for life over the wars?
You know, the walrus operator, the walrus operator was the last battle.
I don't know if that's the cause of it.
But for people who don't know, you can look it up.
There's the walrus operator, which looks like a colon and equals sign. Yeah, colon equals sign.
And it actually does maybe the thing that the equals sign should be doing.
Yeah.
Maybe, right, exactly.
Yeah.
But just historically,
the equals sign means something else.
It just means assignment.
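For listeners who want to see it, a minimal sketch of the walrus operator (`:=`, Python 3.8 and later) next to ordinary assignment. The list is made up for the example.

```python
data = [3, 1, 4, 1, 5, 9]

# Plain "=" is a statement, so it cannot appear inside a condition.
n = len(data)
if n > 5:
    print("long list:", n)

# ":=" is an expression: it assigns and also yields the assigned value.
if (m := len(data)) > 5:
    print("long list:", m)
```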
So Guido stepped down over this.
What do you think about the pressure of leadership?
I did... it's some of that, you mentioned the letter I wrote
to NumPy at the time.
That was a hard time, actually.
I mean, you know, there's been really hard times.
It was hard, you know, you get criticized, right?
And you get pushed and you get,
not everybody loves what you do.
Like anytime you do anything that has impact at all,
you're not universally loved, right?
You get some real critics. And
that's an important energy, because it's impossible for you to do everything right. You need people
to be pushing. But sometimes people can get mean, right? People can, I prefer to give
people the benefit of the doubt. I don't immediately assume they have bad intentions. And maybe
that doesn't happen for everybody. They, for whatever
reason, their past, their experience of people, they sometimes
have bad ends,
so they immediately attribute to you bad intentions.
You're like, where does this come from?
I mean, I'm definitely open to criticism, but I think you're misinterpreting the whole point.
Because I would get that, you know, sort of when I started Anaconda, you know,
sometimes I say to people, I care enough about entrepreneurship
to make some open source people uncomfortable.
And I care enough about open source to make investors uncomfortable.
So you create kind of doubters on both sides.
So we have, and this is just a plea to the listener and the public.
I've noticed this too, that there's a tendency, and social media makes this worse.
When you don't have perfect information about the situation, you tend to fill the gaps with the
worst possible, or at least a bad story that fills those gaps. And I think it's good to live life,
maybe not fully naively, but filling in the gaps with the good, with the best,
with the positive, with the hopeful explanation of why you see this.
So if you see somebody like you trying to make money on a book about NumPy, there's
a million stories around that that are positive, and those are good to think about, to project
positive intent onto people, because, for many reasons,
usually because people are good and they do have good intent. And also, when you project that positive
intent, people step up to that. Yes. So it has this kind of viral nature to it.
And of course, what Twitter early on figured out, and Facebook, is that they can make a lot of money and engagement from the negative.
Yes, and so, like, we're fighting this mechanism. That's challenging. It's like, easier,
it's just easier to be negative, and then for some reason something in our mind
really enjoys sharing that and getting all excited about the negativity. We do.
Yeah, but some protective mechanism perhaps, that we're worried we're going to get eaten
if we don't.
Exactly.
It's hard to be effective as a group of people in a software engineering project.
You have to project positive intent, I think.
I totally agree.
Totally agree.
And I think that's very...
And so that happens in this space.
But Python has done a reasonable job in the past, but here is a situation where I think
it started to get this pressure where it didn't.
I really didn't know enough about what happened.
I've talked to several people about it.
I know most of the steering committee members today,
one person nominated me for that role,
but it's the wrong role for me right now.
I have a lot of respect for the Python developer space
and the Python developers.
I also understand the gap between computer science
Python developers and array programming developers, or science developers.
In fact, Python succeeds in the array space the more it has people in that boundary.
There's often very few.
I was playing a role in that boundary and working like everything to try to keep up with
even what Guido was saying.
I'm a C programmer, but not a computer scientist.
I was an engineer and physicist and mathematician,
and I didn't always understand what they were talking about,
and why they would have opinions the way they did.
So, you know, you have to listen and try to understand,
and then you also have to explain your point of view
in a way they can understand.
And that takes a lot of work, and that communication
is always the challenge.
And what we're describing here about the negativity is just another form of that.
Like, how do we come together?
And it does appear we're wired anyway
to at least have a, there's a part of us,
so the enemy, you know, friend, enemy.
And we see, yeah, it's like,
why are we wired on the enemy front?
Yeah.
So why are we pushing that?
Why are we promoting that so deeply?
Assume friend until proven otherwise.
Yes.
Yeah. Yeah.
So, because you have such a fascinating mind and all,
let me just ask you these questions.
So, one interesting side on the Python history
is the move from Python 2 to Python 3.
You mentioned move from Python 1 to Python 2,
but the move from Python 2 to Python 3
is a little bit interesting because it took a very long time.
It broke in quite a small way backward compatibility, but even that small way seemed to have been
very painful for people.
Is there lessons you draw?
Oh, there are tons of lessons from how long it took and how painful it seemed to be.
Yeah, tons of lessons.
Well, I mentioned here earlier that NumPy was written in 2005.
It was in 2005 that I actually went to Guido to talk about getting NumPy into Python 3. Like,
my strategy was to, oh, we're moving to Python 3. Let's have that be. And it seems funny
in retrospect because, like, wait, Python 3, that was in 2020, right? When we finally
ended the support for Python 2, or at least 2017. The reason it took a long time,
a lot of time, I think it's because, one of the things is there wasn't much to like about Python 3.0,
3.1. It really wasn't until 3.3. I consider Python 3.3 to be Python 3.0. It wasn't until Python 3.3 that
I felt there was enough stuff in it to make it worth anybody using it, right? And then 3.4 started to be,
oh, yeah, I want that. And then 3.5 has the matrix multiply operator. And now it's like, okay,
we've got to use that. Plus the libraries that started leveraging some of the features of Python 3. Exactly.
Yeah. So it really, the challenge was, but it also illustrated a truism that, you know,
when you have inertia, when you have a group of people using something,
it's really hard to move them away from it.
You can't just change the world on them.
And Python 3, you know, made some,
I think it fixed some things Guido had always hated.
I don't think he liked the fact
that print was a statement.
He wanted to make it a function.
But in some sense, that's a bit of a gratuitous change
to the language, you could argue.
And there are people who have.
But one of the challenges was that there weren't
enough features and too many just changes without features. And so the empathy for the end
user as to why they would switch wasn't there. I think also it illustrated just the
funding realities. Like, Python wasn't funded. Like, it was also a project with a bunch of
volunteer labor, right? It had more people, so more volunteer labor, but it was still,
it was funded in the sense that at least Guido had a job.
And I've learned some of the behind the scenes on that now,
since talking to people who lived through it.
And maybe not on air, we can talk about some of that.
But it's interesting to see, Guido had a job,
but his full time job wasn't just to work on Python.
Yeah.
Like he had other things to do.
Just wild.
It is wild, isn't it?
It's wild how few people are funded.
Yes.
How much impact they have.
Yes.
Maybe that's a feature, not a bug.
I don't know.
Maybe, yes, exactly.
At least early on.
It's sort of, I know.
Yeah.
It's like Olympic athletes are often severely underfunded,
but maybe that's what brings out the greatness.
Perhaps.
Yes, correct.
No, exactly.
Maybe this is an essential part of it.
Because I do think about that in terms of,
I currently have an incubator for open source startups.
What I'm trying to do right now is create the environment
I wish had existed when I was leaving academia with NumPy
and trying to figure out what to do.
I'm trying to create those opportunities and environments.
And that's what drives me still.
How do I make the world easier for the open source entrepreneur?
So let me stay, I mean, I could probably stay on NumPy for a long time, but this is a fun
question. So, Andrej Karpathy leads the Tesla Autopilot team, and
he's also one of the most legit programmers I know.
It's like, he builds stuff from scratch a lot.
And that's how he builds intuition about how a problem works.
He builds it from scratch, and I always love that.
And the primary language he uses is Python for the intuition building.
But he posted something on Twitter saying that they got a significant improvement on some aspect of their data loading,
I think, by switching away from np.sqrt, so the NumPy implementation of square root,
to math.sqrt. Then somebody else commented that you can get an even greater improvement by using the vanilla
Python square root, which is like power 0.5. Yeah, and it's fascinating to me. I just wanted to...
So that wasn't some shade throwing at NumPy, no, no. But also,
it's a good way to ask about the trade-off between usability and efficiency broadly in NumPy, but also on these specific weird quirks of a single function.
Yep.
So, on that point, if you use a NumPy math function on a scalar, it's going to be slower
than using a Python function on that scalar.
Right?
Because the math object in NumPy is more complicated, because you can also call that
math object on an array.
And so effectively, it goes through the same machinery.
There aren't enough of the, which you could do, like, checks and fast
paths.
So yeah, if you're basically doing a list, if you run over a list, in fact, for problems that are less than a thousand, even maybe 10,000, if you're going more than 10,000, that's where you definitely need to be using arrays.
But if you're less than that, and for data loading, if you're doing a reading process, essentially it's not compute bound, it's IO bound.
And so you're really taking lists of a thousand at a time and doing work on them. Yeah, you could be faster just using Python.
It's pretty good Python.
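The scalar overhead described here is easy to measure. The sketch below times the three spellings of square root on a single Python float and then on a large array. The repetition counts and array size are arbitrary choices for the example, and no timings are quoted because they depend on the machine and on the Python and NumPy versions.

```python
import math
import timeit

import numpy as np

x = 2.0

# All three spellings agree on the value for a plain Python float.
print(math.sqrt(x), x ** 0.5, float(np.sqrt(x)))

# Time each call on a scalar.
n = 100_000
t_numpy = timeit.timeit(lambda: np.sqrt(x), number=n)
t_math = timeit.timeit(lambda: math.sqrt(x), number=n)
t_pow = timeit.timeit(lambda: x ** 0.5, number=n)
print(f"scalar np.sqrt:   {t_numpy:.4f} s")
print(f"scalar math.sqrt: {t_math:.4f} s")
print(f"scalar x ** 0.5:  {t_pow:.4f} s")

# On a large array one ufunc call replaces the whole Python-level loop.
big = np.arange(100_000, dtype=np.float64)
t_array = timeit.timeit(lambda: np.sqrt(big), number=10)
t_loop = timeit.timeit(lambda: [math.sqrt(v) for v in big], number=10)
print(f"array np.sqrt:       {t_array:.4f} s")
print(f"loop with math.sqrt: {t_loop:.4f} s")
```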
See, but also, and then this is the society in trouble.
There's the fundamental questions when you look at the long arc of history.
It's very possible that np.sqrt is much faster.
It could be.
So, like, in terms of, like, don't worry about it.
It's the evils of over-optimization or whatever.
All the different quotes around that, sometimes obsessing about this particular little quirk
is not sufficient.
Like, for somebody, if you're trying to optimize your path, I mean, I agree, premature
optimization creates all kinds of challenges, right?
Because now, but you may have to do it.
I believe the quote is, it's the root of all evil.
It's the root of all evil, right?
I mean, it's by Donald Knuth, I think, or maybe it's from somebody else.
Well, Donald Knuth is kind of like Mark Twain.
People just attribute stuff to him.
No matter.
And it's fine, because he's brilliant.
So no, I was a LaTeX user myself.
And so I have a lot of respect.
And he did more than that, of course, but
yeah, he's someone I really appreciate in the computer science space. Yeah, I think that's appropriate.
There's a lot of little things like that where people, actually, if you understood it, you'd go, yeah, of course that's the case.
Yeah, and the other part I didn't mention,
Numba was a thing we wrote early on, and I was really excited by Numba because it's something we wanted.
It was a compiler for Python syntax.
And I wanted it from the beginning of writing NumPy
because of this function question.
Like, the power of arrays is really
that you can write functions using all of it.
It has implicit looping.
Right, so you don't worry about it.
This is an N-dimensional for loop with four loops,
four for statements.
You just say, oh, big four-dimensional array,
I'm gonna do this operation, this plus, this minus,
this reduction, and you get this,
it's called vectorization in other areas,
but you can basically think at a high level
and get massive amounts of computation done.
With the added benefit of, oh, it can be parallelized easily.
It can be put in parallel, you don't have to think about that.
In fact, it's worse to go decompose your,
you write the for loops and then try to infer
parallelism from for loops.
That's actually a harder problem than to take the array problem
and just automatically parallelize that problem.
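The "implicit looping" point can be shown side by side. In this sketch (array shapes and names are arbitrary), four nested `for` statements over a four-dimensional array are replaced by one array expression, and a reduction is written the same way.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 4, 5, 6))
b = rng.random((3, 4, 5, 6))

# Explicit version: four nested for statements over a four-dimensional array.
out_loop = np.empty_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        for k in range(a.shape[2]):
            for m in range(a.shape[3]):
                out_loop[i, j, k, m] = a[i, j, k, m] + b[i, j, k, m]

# Array version: the looping is implicit.
out_vec = a + b
print(np.allclose(out_loop, out_vec))  # True

# A reduction is also a single expression.
row_sums = (a - b).sum(axis=-1)
print(row_sums.shape)  # (3, 4, 5)
```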
That's what, and so functions in NumPy
are called universal functions, ufuncs.
So square root is an example of a ufunc.
There are others: sine, cosine, add, subtract.
In fact, one of the first libraries in SciPy
was something called special, where I added
Bessel functions and all these special functions
that come up in physics.
And I added them as ufuncs, so they could work on arrays.
So I understood ufuncs very, very well from day one
inside of Numeric.
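A short sketch of what "ufunc" means in practice, using only functions that ship with NumPy. The special functions in SciPy's `special` module follow the same pattern but are not imported here.

```python
import numpy as np

# sqrt, sin, cos, add and subtract are all instances of np.ufunc.
for f in (np.sqrt, np.sin, np.cos, np.add, np.subtract):
    print(f.__name__, isinstance(f, np.ufunc))

# The same ufunc accepts a scalar or an array.
print(np.sqrt(16.0))                        # 4.0
print(np.sqrt(np.array([1.0, 4.0, 9.0])))   # [1. 2. 3.]

# Binary ufuncs also carry methods such as reduce and accumulate.
print(np.add.reduce([1, 2, 3, 4]))          # 10
print(np.add.accumulate([1, 2, 3, 4]))      # [ 1  3  6 10]
```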
That was one of the things we tried to make better
in NumPy, was how do they work?
Can they do broadcasting?
What does broadcasting mean? But one of the
problems is, okay, what do I do with a Python scalar? So what happens is the Python scalar gets
broadcast to a zero-dimensional array, and then it goes through the whole same machinery as if
it were a 10,000-dimensional array, and then it kind of unpacks the element and does the
addition. That's not to mention the function it calls,
in the case of square root, is just the C lib square root.
In some cases, like Python's power,
there's some optimizations they're doing
that can be faster
than just calling the C lib square root.
In the interpreter or the...
No, in the C code, in the Python runtime.
In the Python runtime, so they've really optimized it,
and they have the freedom to do that
because they don't have to worry about...
It's just a scalar.
It's just a scalar.
Right, they don't have to worry about the fact
that, oh, this could be an object with many, you know,
many pieces.
The ufunc machinery is also generic in the sense
that typecasting and broadcasting, broadcasting is the idea
of, I have a zero-dimensional array,
I have a scalar with a four-dimensional array, and I add them.
Oh, I have to kind of coerce the shape of this guy to make it work against the whole
four-dimensional array.
So it's the idea of, I can do a one-dimensional array against a two-dimensional array and
have it make sense.
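The broadcasting cases mentioned here (a scalar against a four-dimensional array, a one-dimensional array against a two-dimensional one) look like this in practice. Shapes and values are made up for the sketch.

```python
import numpy as np

# A Python scalar is treated as a zero-dimensional array.
s = np.asarray(2.0)
print(s.ndim, s.shape)   # 0 ()

# Scalar against a four-dimensional array: the scalar is stretched to fit.
a4 = np.ones((2, 3, 4, 5))
print((a4 + 2.0).shape)  # (2, 3, 4, 5)

# One-dimensional against two-dimensional: the row is reused for every row.
grid = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]]
row = np.array([10, 20, 30])
print(grid + row)                  # [[10 21 32]
                                   #  [13 24 35]]
print(np.broadcast(grid, row).shape)  # (2, 3)

# Shapes that cannot be lined up raise an error instead of guessing.
try:
    grid + np.array([1, 2])
    failed = False
except ValueError as exc:
    failed = True
    print("cannot broadcast:", exc)
```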
Well, what NumPy does is challenge you to reformulate, rethink your problem.
Yes.
As a multi-dimensional array problem, versus, like, move away from scalars
completely.
Right.
Exactly.
In fact, that's where some of the edge case boundaries are, well, they're still
there, and this is where array scalars are particular.
So array scalars are particularly bad in the sense that they were written so that you
could optimize the math on them, but that hasn't happened.
Right.
And so their default is to coerce the array scalar to a zero-dimensional array and then use the NumPy machinery.
We're lying a little bit in the sense that, well, first, the 40x slowdown of using
array scalars is inside of a loop.
Because if you used Python scalars, you'd already be 10 times faster.
But then we would get 100 times faster over that using just compilation.
And what we do is compile the loop out of the interpreter to machine code.
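As a rough sketch of "compiling the loop out of the interpreter", the example below decorates an explicit Python loop with Numba's `njit`. Numba is an optional third-party package, so the sketch falls back to a do-nothing decorator when it is not installed; that fallback is an assumption of this example and only keeps it runnable, it does not compile anything. The function name `sum_of_squares` is made up, and no speedup figure is claimed because that depends on the machine.

```python
import numpy as np

try:
    from numba import njit   # optional dependency
except ImportError:          # fallback so the sketch still runs uncompiled
    def njit(func):
        return func

@njit
def sum_of_squares(values):
    # An explicit element-by-element loop; with Numba present this loop
    # is compiled to machine code on the first call.
    total = 0.0
    for i in range(values.shape[0]):
        total += values[i] * values[i]
    return total

data = np.arange(1000, dtype=np.float64)
print(sum_of_squares(data))        # 332833500.0
print(float(np.sum(data * data)))  # 332833500.0
```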
And that's always been the power of Python, this extensibility.
Because people say, oh, Python's so slow. Well, sure, if you do all your logic in the runtime of the Python
interpreter, yeah. But the power is that you don't have to. All the logic
you write at the high level is just high-level logic. And the actual calls you're making
could be on gigabyte arrays of data, and that's all done at compiled speeds. And the fact
that that integration, one, can happen, but two, is separable, that's one of the...
There are languages like Julia that say, we're going to be all in one, you can do all of it together. And
the jury's out on whether that's possible. I tend to think
there are separate concerns there. Generally you will want to precompile
some of your loops. Like SciPy: there's a compilation step to install SciPy.
It takes about two hours.
If you have many machines, maybe you can get it
down to one hour, but to compile all those libraries
takes a while.
You don't want to do that at runtime.
You don't want to do that all the time.
You want to have this precompiled binary available
that you're then just linking into.
So there are real questions here.
Code that is running, binary code, is more than source code.
It's the created object code, it's the linker, it's the loader, it's how that gets interpreted
inside of virtual memory space.
There's a lot of details there that actually I didn't understand for a long time until
I read books on the topic.
And it led to... the more you know, the better off you are.
And you can deal with more details, but sometimes it helps with abstractions,
too.
Well, the problem, as we mentioned earlier, with abstractions is you kind of sometimes assume
that whoever implemented this thing had your case in mind and found the optimal solution.
Yes. Correct.
Or like you assume certain things. I mean, there's a lot of... One of the really powerful things to me early on,
I mean, it sounds silly to say,
but with Python, probably one of the reasons
I fell in love with it is dictionaries.
Yes.
So obviously, probably most languages have some...
Mapping concept.
Some mapping concept, but it felt like it was a first-class
citizen, and my brain was just able to think in dictionaries.
But then there's the thing that I guess I still use to this day: ordered dictionaries.
Because that seems like a more natural way to construct dictionaries, and from the formal computer science perspective the running-time cost
is not that significant. There are a lot of things to
understand about dictionaries that the abstraction
kind of doesn't necessarily incentivize you to understand.
Right. Do you really understand the notion of a hash map and how that dictionary is implemented?
But you're right. Dictionaries are a good example of an abstraction that's powerful.
And I agree with you. I love dictionaries too. It took me a while to understand
that. Once you do, you realize, oh, they're everywhere.
And Python uses them everywhere too.
Dictionaries are actually one of the foundational things it's constructed from, and it does everything
with dictionaries.
So it is.
It's powerful.
Ordered dictionaries came later, but it is very, very powerful.
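A short illustration of "Python uses them everywhere", using only the standard library. The `Point` class and the variable names are hypothetical, chosen just for this sketch:

```python
import types

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)

# Instance attributes are stored in a plain dict.
instance_namespace = vars(p)
print(instance_namespace)

# Module-level names live in a dict too.
module_namespace_is_dict = isinstance(globals(), dict)

# A class namespace is exposed as a read-only view over a mapping.
class_namespace_is_mapping = isinstance(Point.__dict__, types.MappingProxyType)

# Since Python 3.7 the built-in dict preserves insertion order,
# which covers many uses of collections.OrderedDict.
d = {}
d["z"] = 26
d["a"] = 1
d["m"] = 13
order = list(d)
print(order)
```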
It took me a little while, coming entirely from array programming, to understand
these other objects, like dictionaries and lists and tuples and binary trees. I guess I wasn't a computer scientist;
I studied arrays first. And so I was very array-centric. And you realize, oh, these
others do have purpose and value, actually.
I agree. There's a friendliness about them. One way to think about arrays is that arrays are just full of numbers. But to make them accessible
to humans and make them less error-prone for human users, sometimes you want to attach names,
human-interpretable names that are sticky, to those arrays. So that's how you start to
think about dictionaries.
Yeah, good point.
You start to convert numbers into something that's human-interpretable.
And that's actually the tension I've had with NumPy, because I've built so much tooling
around human interpretability and also around protecting me from making mistakes a year later.
I wanted to force myself to use English versus numbers.
Yes.
There's a project called labeled arrays.
Very early it was recognized that, oh, we're indexing NumPy with just
numbers, all the columns and particularly the dimensions.
I mean, if you have an image, you don't necessarily need to label each column or row, but if you
have a lot of images, or you have another dimension, you'd at least like to label
the dimension: this is x, this is y, z. Give it some human meaning or some
domain-specific meaning.
That was one of the impetuses for pandas, actually.
It was just, oh, we do need to label these things.
Labeled array was an attempt to add, like, a lighter-weight version of that.
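The idea of naming dimensions instead of remembering axis numbers can be sketched in a few lines. This is a toy illustration assuming NumPy is installed; the `dims` mapping and the `mean_over` helper are hypothetical and are not the API of pandas or of any labeled-array project:

```python
import numpy as np

# A stack of 10 images, each 4 rows by 6 columns.
images = np.arange(10 * 4 * 6, dtype=float).reshape(10, 4, 6)

# Hypothetical labels: human-readable dimension name -> axis number.
dims = {"image": 0, "y": 1, "x": 2}

def mean_over(array, dim_name, dims):
    """Average over a dimension chosen by name rather than by position."""
    return array.mean(axis=dims[dim_name])

per_pixel = mean_over(images, "image", dims)   # average image, shape (4, 6)
per_row = mean_over(images, "x", dims)         # row means per image, shape (10, 4)

print(per_pixel.shape, per_row.shape)
```

A real labeled-array library would carry the names along with the array so they survive slicing and arithmetic; here they live in a separate dict for brevity.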
And that's an example of something I think could be added to NumPy.
But one of the challenges, again, is how do you fund this?
Like I said, one of the tragedies, I think, is that
I never had the chance to...
I was never paid to work on NumPy, right?
So I've always just done it in my spare time.
I've always taken from one thing,
taken from another thing to do it.
And at the time... I mean, today,
paying me to work on NumPy now would not be
a good use of effort, but finally, at Quansight Labs,
I'm actually paying people to work on NumPy and SciPy,
which I'm thrilled with. I'm excited by it.
I wanted to do that.
That's what I always wanted to do from day one.
It just took me a while to figure out a mechanism to do that.
Even in the university setting: respecting that, pushing students, young minds, young
graduate students to contribute, and then figuring out financial mechanisms that enable them
to contribute, and then sort of rewarding them for their innovative scientific journey.
That would be nice.
But then also just the better allocation of resources.
Well, you know, it's the 20-year anniversary of 9/11, and I was just looking: we spent over
$6 trillion in the Middle East after 9/11 in the various efforts there.
And to sort of put politics and all that aside, you just think about the education system,
all the other ways we could have possibly allocated that money.
To me, to take it back,
the amount of impact you would have
by allocating a little bit of money to the programmers
that build the tools that run the world is fascinating.
I mean, it is. I
don't know. I think, again, there is some aspect to being broke that's somewhat of a feature, not a bug:
you make sure that you manage that.
Right. No, I know. But I don't think that's a big part. I think
you can have enough money and actually be wealthy while maintaining your values.
Agreed. There's an old adage that, you know, nations that trade together don't go to war together.
Yeah. I've often thought about, you know, nations that code together.
Yeah, good.
Because one thing I love about open source is it's global. It's multinational. There aren't national boundaries.
One of the challenges with business and open source
is the fact that business is national.
Businesses are entities that are recognized in
legal jurisdictions and have laws that are respected
in those jurisdictions, and hiring,
and yet the open source ecosystem is not there.
Like currently, one of the problems we're solving
is hiring people all over the world.
Because it's a global effort.
And I've had the chance to work,
and I've loved the chance.
I've never been to Iran,
but I once had a conference
where I was able to talk to people there,
and talk to folks in Pakistan.
I've never been there, but we had a call
where there were people there,
like just scientists and normal people.
And there's a certain amount of humanizing that gets you away from the
memes of society that bubble up and get discussed, when the memes are not even an accurate reflection
of the reality of what people are.
If you look at the major power centers that are leading to something like cyber war
in the next few decades.
It's United States, Russia, and China.
And those three countries in particular have incredible developers.
So if they work together... I think the politicians can do their stupid bickering,
but there's a layer of infrastructure, of humanity.
If they collaborate together, that, I think, can prevent major
military conflict, which would,
I think, most likely happen at the cyber level
versus the actual hot war level.
You're right.
No, I think that's a good prediction.
Nations that code together
don't go to war together.
Don't go to war together.
That's a hope, right?
That's one of the philosophical hopes, but yeah
So you mentioned the project of Numba, which is
fascinating. From the early days there's been kind of a pushback on Python, that it's not fast.
You know, if you want to write something that's fast, you use C or C++. If you want to write something
that's usable and friendly
but slow, you use Python.
And so what is Numba?
What is its goal?
How does it work?
Great.
Yeah.
Yes, that was the argument.
And the reality was people would write high-level code
and use compiled code, but there were still use cases
where you want to write Python but then have it still be fast.
You still need to write a for loop.
Before Numba, it was always: don't write a for loop.
Write it in a vectorized way, put it in an array.
And often that can make a memory trade-off.
Quite often you can do it, but then
maybe you use more memory because you have to build this array of data
that you don't necessarily need all the time.
So Numba started from a desire to have kind of a
vectorize that worked.
Vectorize was a tool in NumPy that had been released.
You gave it a Python function, and it gave you a universal
function, a ufunc, that would work on arrays.
So you gave it a function that just worked on a scalar.
The classic case was a simple
function that had an if-then statement in it.
So, the sine x over x function, the sinc function.
If x equals zero, return one;
otherwise do sine x over x.
The challenge is, you don't want that loop
going on in Python, so you want a compiled version of that.
But the vectorize in NumPy would just
give you a Python function.
So it would take the array of numbers
and, at every call, loop back into Python.
So it was very slow.
It gave you the appearance of a ufunc,
but it was very slow.
So I always wanted a vectorize that would take
that Python scalar function and produce a ufunc
working in native binary code.
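The sinc example can be written with `numpy.vectorize` roughly as follows. This is a sketch assuming NumPy is installed; the function and variable names are illustrative. As described above, `np.vectorize` is a convenience wrapper: it still calls the Python function once per element.

```python
import math
import numpy as np

def sinc_scalar(x):
    """sin(x)/x for a single number, with the value at 0 defined as 1."""
    if x == 0:
        return 1.0
    return math.sin(x) / x

# Wrap the scalar function so it accepts arrays. The branch inside
# sinc_scalar is why a plain array expression is awkward here.
# otypes tells NumPy the output dtype up front.
sinc = np.vectorize(sinc_scalar, otypes=[float])

xs = np.array([0.0, math.pi / 2, math.pi])
ys = sinc(xs)
print(ys)
```

Note that NumPy's built-in `np.sinc` computes the normalized form sin(pi x)/(pi x), which is a different function from the one discussed here.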
So in fact, I had somebody work on that with PyPy,
to see if PyPy could be used to produce a ufunc
like that, early on, in 2009 or something like that, 2010.
It didn't work that well. It was kind of pretty bulky.
But in 2012, Peter and I had just started Anaconda.
I'd learned to raise money.
That's a different topic, but I'd learned to raise money
from friends, family, and fools, as they say.
That's a good line.
That's a good line.
Right.
Oh, that's a good line. But, you know, we were trying to do something. We were trying to change
the world. Peter and I are super ambitious. We wanted to make array computing... and we had ideas
for really what's still the energy right now: how do you do data science at scale? We had a
bunch of ideas there. But one of them... I had just talked to people about LLVM, and I was like,
there's a way to do this. I'd heard my friend
Dave Beazley had a compiler course.
So I was looking at compilers,
and I was like, oh, this is what you do.
And so I wrote a version of Numba
that just basically mapped Python bytecode to LLVM.
Nice.
Right.
So the first version was like, this works,
and it produces code that's fast.
This is cool.
Obviously, it was a reduced subset of Python.
It didn't support all of the Python language.
There had been efforts to speed up Python in the past, but those efforts were, I would say,
not from the array computing perspective, not from the perspective of wanting to produce
a vectorize improvement.
They were from a perspective of speeding up the runtime of Python, which is fundamentally
hard because Python allows for some constructs
that you can't speed up.
It's this generic, you know, when it does this variable.
So I, from the start, did not try to replicate Python's semantics entirely.
I said, I'm going to take a subset of the Python syntax and let people write syntax in
Python, but it's kind of a new language, really.
It's almost like for loops, like focusing on for loops, scalar arithmetic,
you know, a really typed language,
a typed subset.
That was the key.
So we wanted to add inference of types,
so you didn't have to spell all the types out.
Because when you call a function...
so Python is typed, it's just dynamically typed.
So you don't tell it what the types are,
but when it runs, every time an object runs, there's a type for the variables.
You know what it is. And so the design goals of Numba were to make
it possible to write functions that could be compiled and to have them used for
NumPy arrays. Like, the need to support NumPy arrays.
And so how does it work? Do you add a comment within Python that tells it what to do?
Like, how do you help out the compiler?
Yeah, so there isn't much, actually.
You don't.
It's kind of magical in the sense that it just looks at
the types of the objects and then
does type inference to determine any other intermediate variables it needs.
Then it was also because we had a use case that could work early.
One of the challenges of any kind of new development is, if you have something that's
going to take you a long time to make work, it's really hard to get it off
the ground.
If you have a project with some incremental story, where it can start working today and solve
a problem, then you can start getting it out there, getting feedback.
Because Numba is nine years old today, right?
The first two, three versions were not great, right?
But they solved a problem, and some people could try it.
We could get some feedback on it.
Not great in that it was very focused, very fragile?
Very...
It was the fragility.
The subset it would actually compile was small.
And so the way it worked is you write a function
and you say @jit. You use decorators.
So decorators are just these little constructs that
let you decorate code with an @ and then a name.
The @jit would take your Python function and actually
just compile it and replace the Python function
with another function that interacts with this compiled
function.
And it would just do that.
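In today's Numba the decorator workflow looks roughly like the sketch below. It uses `njit`, Numba's shorthand for `jit` in nopython mode, rather than the plain `@jit` mentioned above. The `try`/`except` fallback is an assumption added so the example still runs, uncompiled, where Numba is not installed. The function name and arguments are illustrative.

```python
import math

try:
    from numba import njit       # real Numba decorator, if available
except ImportError:
    def njit(func):              # fallback: leave the function as plain Python
        return func

@njit
def sinc_sum(n, step):
    """Sum sin(x)/x over x = 0, step, 2*step, ... with an explicit for loop."""
    total = 0.0
    for i in range(n):
        x = i * step
        if x == 0.0:
            total += 1.0
        else:
            total += math.sin(x) / x
    return total

# No type annotations are given. With Numba present, the argument types
# (an integer and a float here) are inspected on the first call and a
# specialized machine-code version is compiled for them.
result = sinc_sum(4, 0.5)
print(result)
```

The name `sinc_sum` now refers to the wrapper the decorator returned, not to the original Python function, which is the replacement step described above.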
And we went from Python bytecode; we then went to the AST.
I mean, writing compilers, actually...
I learned a lot about why computer science
is taught the way it is, because
compilers can be hard to write.
They use tree structures,
they use all the concepts of computer science
that are needed.
And it's actually hard to...
it's easy to write a compiler and have it be spaghetti code.
Like, the passes become challenging.
And we ended up with three versions of Numba.
Numba got written three times.
What programming language is Numba written in?
Python.
Wait, okay.
Yeah, Python.
So really?
Yeah.
That's fascinating.
Yeah, so Python, but then the whole goal of Numba is to translate Python bytecode to
LLVM.
And so LLVM actually does the code generation.
In fact, a lot of times I'd say,
yeah, it's super easy to write a compiler if you're writing neither the parser nor the code generator.
Right. So for people who don't know, LLVM is a compiler itself, so you're compiling it.
It's really badly named: low-level virtual machine, and that part of it is not used.
It's really low-level. Chris, he doesn't mean that.
Yeah, love Chris. But the name implies that the virtual machine is what it's all about.
It's actually the IR and the libraries that do the code generation.
That's the real beauty of it.
What I love about LLVM was the fact that it was a platform you could collaborate
on.
Right?
Instead of the internals of GCC or the internals of the Intel compiler, like how do I extend
that?
And it was a place where you collaborate.
And we were early, I mean, people had started before.
It's a slow compiler, like it's not a fast compiler.
So for some kinds of JITs... like, JITs are common in languages,
because every browser has a JavaScript JIT.
It does real-time compilation of the JavaScript
to machine code.
For people who don't know, JIT is just-in-time compilation.
Thank you, yeah, just-in-time compilation.
They're actually really sophisticated.
In fact, I got jealous of how much effort was put into the JavaScript JIT.
Yes.
Well, it's kind of incredible what they've done with JavaScript JIT.
I completely agree. I'm very impressed.
But, you know, Numba was an effort to make that happen with Python.
And so we used some of the money we raised from Anaconda to do it, and then we also applied
for this DARPA grant and used some of that money to continue the development.
Then we used proceeds from service projects we would do.
We'd get consulting projects and then use some of the profits to invest in Numba.
We ended up with a team of two or three people working on Numba.
It was in fits and starts, right?
And ultimately, there was the fact that we had a commercial version of it that we were also writing.
So part of the way I was trying to fund Numba was to say, well, let's do the free Numba
and then we'll have a commercial version of Numba called NumbaPro.
And what NumbaPro did is it targeted GPUs.
So we had the very first CUDA JIT, the very first @jit compiler where, in 2013, you could run not just a
ufunc on CPU, but a ufunc on GPUs. And it automatically parallelized it, and you'd get a
thousand-x speedup.
And that's an interesting funding mechanism because, you know, large
companies or larger companies care about speed in just this way.
So it's exactly a really good way.
Yeah, there have been a couple of things you know people will pay for.
One, they'll pay for really good user interfaces.
Right?
And so I'm always looking for the things people will pay for
that you can actually adapt to the open source infrastructure.
One is definitely user interfaces.
The second is speed, like a better runtime, a faster runtime.
And when you say people, you mean like a small number of people pay a lot of money,
but then there's also this other mechanism where a ton of people pay...
That's true.
A little bit.
First, we mentioned Anaconda, we mentioned friends, family and fools.
So Anaconda is yet another... so there's a company, but there's also a project that is exceptionally
impactful for many reasons, one of which is bringing a lot more people
into the community of folks who use Python.
So what is Anaconda?
What are its goals?
Maybe, what is Conda versus Anaconda?
Yeah, I'll tell you a little bit of the history of that.
Because with Anaconda, we wanted to scale Python.
Peter and I had that goal when we started Anaconda.
We actually started as Continuum Analytics.
That was the name of the company when it started.
It got renamed to Anaconda in 2015.
But we said we want to scale analytics.
NumPy's great, pandas is emerging, but these need to run at scale with lots of machines.
The other thing we wanted to do was make user interfaces that were web-based. We
wanted to make sure the web did not pass by the Python community,
that we had ways to translate your data science to the web.
So those are the two kinds of technical areas where we thought,
oh, we'll build products in this space.
And that was the idea.
Very quickly in... but of course, the thing I knew how to do was consulting, to make
money and to make sure my family and friends and the fools who had invested didn't lose their
money.
So it's a little different than if you take money from a venture fund.
If you take money from a venture fund, they want you to go big
or go home.
They're kind of expecting 9 out of 10 to fail,
or 99 out of 100 to fail.
It's different. I was on a barbell strategy.
I was like, I can't fail.
I mean, I may not do super well,
but I cannot lose their money.
So I'm going to do something I know
can return a profit, but I want to have exposure
to an upside.
So that's what happened with Anaconda.
There were lots of things we did not do well in terms of that structure, and I've learned since how to do it better. But we did
a really good job of kind of attracting interest around the area to get good people
working and then funnel some money to some interesting projects. I'm super excited about
what came out of our energy there. Like, a lot did.
So what are some of the interesting projects?
So, Dask, Numba, Bokeh, Conda. There was Datashader, Panel,
HoloViz. These are all tools that are extremely relevant in terms of helping you build applications,
build tools, build faster code. There's a lot.
JupyterLab. JupyterLab came out of this too.
That's fascinating.
Okay, so, well, Bokeh does plotting, is that...?
Bokeh does plotting.
So Bokeh was one of the foundational things to say, I want to do a plot in Python, but have
the thing show up on the web.
Right.
That's right.
So, plotting, to me, still, with all due respect to matplotlib and Bokeh,
feels like a still unsolved problem.
An unsolved problem. It is.
It's a big problem.
Right. Because, I mean, I don't know,
it's visualization broadly.
Yes. I think we've got a pretty good API story
around certain use cases of plotting.
Yeah. But there's a difference between static plots
versus interactive plots versus "I'm an end user,
I just want to write a simple..." You know,
pandas started the idea of, here's a DataFrame and a .plot. I'm just going to attach plot as a
method to my object, which was a little bit controversial, right? But it works pretty well, actually,
because there's a lot less you have to pass in, right? You can just say, here's my object, you know
what you are, you tell the visualization what to do. And there are things like that that have not been super well developed entirely, but
Bokeh was focused on interactive plotting.
So it's a short path between interactive plotting and an application, a dashboard application.
And there's some incredible work that got done there, right?
And it was a hard project because then you're basically doing JavaScript and Python.
So we wanted to tackle some of these hard problems and just go after them.
We got some DARPA funding to help.
And it was super helpful.
Funny story there.
We actually did two DARPA proposals, but one we were five minutes late for.
And DARPA has a very strict cutoff window.
And so we had two proposals, one for Bokeh and one for, actually, Numba and the other work.
The one we were late for was the foundational numerical work.
So Bokeh got funded. Fortunately, Chris let us use some of the money to
still fund some of the other foundational work.
But it wasn't as... yeah, his hands were tied.
He couldn't do anything about it.
That was a whole interesting story.
So, one of the incredible projects that you worked on is Conda.
Yes.
So, what is Conda?
How did that come about?
Yeah, Conda. Early on, as I said with SciPy,
SciPy was a distribution masquerading as a library.
And you heard me talking about compiler issues and trying to get this stuff shipped,
and the fact that people can use your libraries only if they have them.
So for a long time, we'd understood the packaging problem in Python.
And one of the first things we did at Continuum Analytics, which became Anaconda,
was organize the PyData ecosystem in conjunction
with NumFOCUS.
We actually started NumFOCUS with some other folks
in the community the same year we started Anaconda.
I said, we're going to build a corporation,
but we've also got to reify the community aspect
and build a nonprofit.
So we did both of those.
Can we pause real quick, and can you say what is PyPI, the Python Package Index? Like, this
whole story of packaging in Python?
Yeah, that's what I'm going to get to, actually.
This is exactly the journey I'm on.
It's to sort of explain packaging in Python.
I think it's best expressed through a conversation I had with Guido at a conference, where I said,
so, you know, packaging is kind of a problem.
And Guido said, I don't ever care about packaging.
I don't use it.
I don't install new libraries.
I'm like, I guess if you're the language creator,
and if you need something, you just put it in the distribution,
maybe you don't worry about packaging.
But Guido has never really cared about packaging, right?
And never really cared about the problem of distribution.
It's somebody else's problem.
And that's a fair position to take, I think,
as a language creator.
In fact, there's a philosophical question about,
should you have different development package managers?
Should you have a package manager per language?
Is that really the right approach?
I think there are some answers:
it is appropriate to have development tools.
And there's an aspect of a development tool
that is related to packaging.
And every language should have some story there
to help the developers create.
So you should have language-specific
development tools.
Language development tools that relate to package managers.
But then there's a very specific user story
around package management that those language-specific
package managers have to interact with,
and currently they aren't doing a good job of that.
That was one of the challenges: people did not
see that difference, and the confusion still exists
today. Conda always was about the user: I'm going to use Python to do data science.
I'm going to use Python to do something. How do I get this installed? It was always focused on that.
So it didn't have, like, a develop. The classic example is pip has a pip develop. It's like, I want
to install this into my current development environment today.
Conda has never had that concept because it's not part of the story.
For people who don't know,
pip is a Python-specific package manager that's exceptionally popular. That's probably like the default thing.
It's the default, yeah.
So the story there emerged because, in 2012, we had this meeting at the Googleplex, and Guido was there to come talk
about what we were going to do.
Are we going to make things work better?
And Wes McKinney, me, Peter... Peter has a great photo of me talking to Guido, and he pretends
we're talking about this story. And maybe we were. But we did, at that meeting,
talk about it and ask Guido: we need to fix packaging in Python.
People can't get the stuff.
And he said, go fix it yourself.
I don't think we're going to do it.
All right.
The origin story right there.
All right.
We said, okay, he said to do this ourselves.
So at the same time, people did start to work
on the packaging story in Python.
It just took a little longer.
So in 2012, we were kind of motivated by the training courses
we were teaching. Very similar to what you just mentioned about your mother, it was
motivated by the same purpose: how do we get this into people's hands? And it's
this big, long process that's too expensive. It was actually hurting NumPy development,
because I would hear people saying, don't make that change to NumPy, because I just
spent a week getting my Python environment set up. And if you change NumPy,
I have to reinstall everything, and reinstalling is such a pain.
Don't do it.
Wait, okay.
So now we're not making changes to a library because of the installation
problem that will cause for end users.
Okay, there's a problem with packaging.
There's a problem with installation.
We've got to fix this.
So we said we're going to make a distribution of Python.
And we'd previously done that.
I'd previously done that at Enthought.
I wanted to make one that we would give away for free.
Everyone could just get it.
It was critical that you could just get it.
It wasn't tied to a product.
It was just, you could get it.
And then we constantly thought about,
well, do we just leverage RPM?
But the challenge had always been,
we want a package manager that works on Windows,
Mac OS X, and Linux the same, right?
And it wasn't there.
Like, you don't have anything like that.
You have...
And for people who don't know, RPM is...
Red Hat's package manager.
An operating-system-specific package manager.
Correct.
It's operating-system-specific, yes.
So the design
question is, do you create an umbrella package manager
that's cross-operating-system?
Yes, that was the decision.
And a neighboring design question is,
do you also create a package manager
that spans multiple programming languages?
Correct. Exactly. That was the world we faced, and we decided
to go multiple operating systems and programming-language independent. Because even
Python... and particularly what was important was SciPy has a bunch of Fortran in it, right? And
scikit-learn has links to a bunch of C++. There's a lot of compiled code. And the Python package manager,
especially early on, didn't even support that.
So in 2000... so we released Anaconda,
which is just a distribution of libraries,
but we started to work on Conda in 2012.
The first version of Conda came out in early 2013,
or the summer of 2013,
and it was a package manager.
So you could say, conda install scikit-learn.
In fact,
scikit-learn was a fantastic project that emerged.
It was the classic example of the scikits.
I talked earlier about SciPy being too big
to be a single library.
Well, what the community had done is said,
let's make scikits, and there's scikit-image
or scikit-learn.
There are a lot of scikits.
And it was a fantastic move
the community made. I didn't do it.
I was like, okay, that's a good idea.
I didn't like the name.
I didn't like the fact you'd type scikit-image.
I was like, that's got to be simpler.
sklearn... we've got to make that smaller,
like, typing all this stuff in my imports.
So I was kind of a pressure that way,
but I loved the energy and loved the fact they went out
and they did it. And those people, Jarrod Millman,
and then of course Gaël, and there are people
I'm not even naming... scikit-learn really emerged as a fantastic project.
And the documentation around that is also incredible.
The documentation was incredible. Exactly. I don't know who did that, but
they did a great job. A lot of people at Inria, a lot of European contributors.
Andreas... there's Andreas in the US. There are a lot of people I just adore, I think are amazing
people.
Awesome use of sci-fi, right?
I love the fact that they were using sci-fi.
I effectively do something I love,
which is machine learning, but can install it.
Because there's so many, so many dependencies, right?
So our use case for conda was conda install scikit-learn.
Right.
And it was the best way to install scikit-learn from 2013 to really 2018.
In '17, '18,
pip finally caught up. I still think you should conda install scikit-learn over
pip install scikit-learn, but you can pip install scikit-learn. The issue is the packages they
created were wheels, and pip does not handle the multi-vendor approach. They don't handle
the fact you have C++ libraries you're depending on.
They just stop at the Python boundary.
And so what you have to do in the wheel world
is you have to vendor.
You have to take all the binary and vendor it.
Now, if a change happens in an early dependency,
you have to redo the whole wheel.
So TensorFlow is a good example:
you should not pip install TensorFlow.
It's a terrible idea.
People do it because of the popularity of pip.
Many people think, oh, of course that's how I install
everything Python.
Yeah, this is one of the big challenges.
You take a GitHub repository or just a basic blog post.
The number of times pip is mentioned over conda
is like 100 to one.
Correct.
So they just have to increase.
And that was increasing.
It wasn't true early because pip didn't exist.
Like conda came first. But that's like the long tail of internet documentation.
Correct. User generated. So you think, how do I install, you Google, how do I install
TensorFlow? You're just not going to see conda on that first page.
Right. Exactly. And that's today. You would have in 2016, 2017.
And it's sad because conda solves a lot of usability issues.
Correct.
Like for especially super challenging things, I don't know.
One of the big pain points for me was just on the computer vision side, OpenCV installation.
That, for example, I don't know if conda solved that.
Does conda have an OpenCV package?
I don't know.
I certainly know pip has not solved it.
I mean, there's complexities there because, right, I actually don't know. I
should probably know a good answer for this. But, you know, if you compile OpenCV with certain
dependencies, you'll be able to do certain things. So there's this kind of flexibility of what options you compile with.
Yes, and I don't think it's trivial to do that with conda or
So conda has a notion of variants of a package.
You can actually have different compilation versions of a package.
So not just the versions are different, but this is compiled with these optimizations on. So conda does have an answer. It has flavors.
Yeah, flavors, basically.
Pip, as far as I know, does not have that.
No.
Pip generally hasn't thought deeply about the binary dependency
problem.
That's why, fundamentally, it doesn't work for the SciPy ecosystem.
You can sort of paper over it
and duct tape it, and it kind of works until it doesn't,
and it falls apart entirely.
So it's been a mixed bag. And I've been having lots of conversations with people
over the years because, again, it's an area where you understand some things, but not
all the things, but they've done a great job of community appeal. This is an area where
I think Anaconda as a company needs to do some things in order to make conda more community
centric, right? And I talk about this all the time, there's a balance.
Every project starts as what I call company-backed open source, even if the company
is yourself, just one person, you know, doing business as. But ultimately, for projects
to succeed virally and become massive influencers, they have to get
a community of people on board.
They have to get other people on board.
So it has to become community driven.
And a big part of that is engagement with those people, empowering people, governance around it.
And what happened with conda in the early days
was pip emerged, and we did do some good things, conda-forge.
The conda-forge community is sort of the community recipe creation community.
But conda itself, I still believe, and Peter, who is CEO of Anaconda, is my co-founder.
I ran Anaconda till 2017, 2018.
Peter's still at Anaconda, right? We're still great friends. We talk all the time.
I love him to death. There's a long story there about why and how, and we can cover that in
some other podcast perhaps, maybe a more business-focused one,
but that's one area where I think
conda should be more community driven.
Like he should be pushing more to get more community
contributors to conda, and
Anaconda shouldn't be fighting this battle.
Yeah, right.
It's actually, it's really a developer.
So, exactly as you said, help the developers, and then
they'll actually move us in the right direction.
That was the problem I have: many of the cool kids I know don't use conda. That to me
is confusing. It is confusing. It's really a matter of, conda has some challenges. First
of all, it still needs to be improved. There's lots of improvements to be made. It's that aspect
of, wait, who's doing this? And the fact that then the PyPA really stepped up.
Like they were not solving the problem at all.
And now they kind of got to where they're solving it
for the most part.
And then effectively you could get,
like conda solved a problem that was there.
And it still does.
There's still great things it can do.
And we still use it all the time at Quansight
and with other clients,
but you can kind of do similar
things with pip and Docker, right?
So especially with the web development community. Part of it, again, is there's a lot of
different kinds of developers in the Python ecosystem.
And there's still a lack of some clear understanding.
I go to the Python conference all the time, and there's only a few people in the
PyPA who get it.
And then others who are just massively
trumpeting the power of pip,
but just do not understand the problem.
So one of the obvious things to me,
from a non-programmer perspective,
is the cross-operating-system usability
that's much more natural.
So there's people who use Windows,
and it seems much easier to recommend conda there,
but then you should also recommend it across the board.
So definitely.
What I recommend now is a hybrid. I do.
I mean, I have no problem with pip.
Is it possible to use pip with conda?
Oh, it is, it is.
Build an environment with conda,
and then pip install on top of that, that's fine.
Be careful about pip installing OpenCV
or TensorFlow, because if somebody's allowed that,
it's almost certainly done in a way
that can't be updated that easily.
So install the big packages, the infrastructure,
with conda, and then the weirdos,
the weird implementation for some header.
There's a cool library I used that,
based on your location and time of day and date,
tells you the exact position of the sun relative to the Earth.
It's just a simple library, but it's very precise.
I was like, all right, and it's like, pip.
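The hybrid described here (build the environment with conda, then pip install the small pure-Python extras on top) can be sketched roughly as follows. This is only an illustration: the `CONDA_FIRST` set, the `plan_install` helper, and the package name `sun-position` are hypothetical names made up for the sketch, and the function only builds the command lines without running them.

```python
import shlex

# Assumption for this sketch: which packages count as "big infrastructure"
# is our own choice, not a list given in the conversation.
CONDA_FIRST = {"numpy", "scipy", "scikit-learn", "opencv", "tensorflow"}


def plan_install(env_name, packages):
    """Return shell commands for a conda-first, pip-on-top install plan."""
    conda_pkgs = sorted(p for p in packages if p in CONDA_FIRST)
    pip_pkgs = sorted(p for p in packages if p not in CONDA_FIRST)

    # Step 1: create the environment and install the compiled packages with conda.
    commands = [
        ["conda", "create", "--yes", "--name", env_name, "python", "pip", *conda_pkgs]
    ]
    # Step 2: install the remaining packages with pip inside that environment.
    if pip_pkgs:
        commands.append(
            ["conda", "run", "--name", env_name, "python", "-m", "pip", "install", *pip_pkgs]
        )
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    # "sun-position" is a placeholder for a small pip-only library.
    for line in plan_install("vision", ["opencv", "numpy", "sun-position"]):
        print(line)
```

Keeping the split in one place makes it easy to see which packages are left to pip, which is the part the speaker says to be careful about.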
Well, the thing they did really well is, Python developers who want to get their stuff published,
you have to have a pip recipe.
Right?
I mean, the challenge is, and there's a key thing that needs to be
added to pip: just simply add to pip the ability to defer to a system package manager.
Recognize you're not going to solve all the dependency problems.
So give up and allow a system
packager to work.
That way, if Anaconda is installed and it has pip,
it would default to conda to install stuff,
but Red Hat would default to RPM
to install more things.
That's a key, not difficult, but some-work
feature that needs to be added.
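The deferral idea can be sketched as a tiny decision rule. To be clear, this does not describe anything pip does; the speaker is describing a feature he says still needs to be added, and `SystemManager` and `choose_installer` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SystemManager:
    """Hypothetical stand-in for whatever package manager owns the environment."""
    name: str
    can_provide: Callable[[str], bool]


def choose_installer(package: str, system_manager: Optional[SystemManager]) -> str:
    """Defer to the system package manager when it can supply the package."""
    if system_manager is not None and system_manager.can_provide(package):
        return system_manager.name
    # Fall back to pip for anything the system manager does not carry.
    return "pip"


if __name__ == "__main__":
    # Assumed catalogues, for illustration only.
    conda = SystemManager("conda", lambda pkg: pkg in {"numpy", "scipy", "opencv"})
    rpm = SystemManager("rpm", lambda pkg: pkg in {"numpy"})

    print(choose_installer("opencv", conda))  # handled by conda in this sketch
    print(choose_installer("opencv", rpm))    # not in the rpm catalogue, so pip
    print(choose_installer("opencv", None))   # no system manager, so pip
```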
That's an example of something like,
I've known we need to do it.
It's where I wish I had more
money. I wish I was more successful on the business side,
trying to get there. But I wish my, you know, my family,
friends and fools community that I know was larger
and had more money. Because I know tons of things to do
effectively with more resources. But, you know, I have not
yet been successful at channeling tons of it. Some, you
know. I'm happy with what we've done.
We've created again at Quansight what we created to get
Anaconda started.
We created a community to get Anaconda started,
done it again with Quansight, super excited by that.
By the way, it took three years to do it.
What is Quansight?
What is its mission?
We've talked a few times about different fascinating aspects
of it, but let's do big picture.
What is Quansight?
Big picture of Quansight: its mission is to connect data
to an open economy.
So it's basically consulting for the Python ecosystem.
It's a consulting company.
And what I've said when I started it was,
we're trying to create products, people, and technology.
So it's divided into two groups,
and a third one as well.
The two groups are a consulting services company
that just helps people do data science
and data engineering and data management better
and more efficiently.
Full stack, like full stack data science,
the full thing. We'll help you build an infrastructure
if you're using Jupyter.
We do staff augmentation if you need more programmers,
help you use Dask more effectively, help you use GPUs
more effectively. Just basically a lot of people need help.
So we do training as well to help people, you know, both get immediate help and then
learn from somebody.
We've added a bunch of stuff too.
We've kind of separated some of these other things into another company called OpenTeams that
we recently started.
One of the things I loved about what we did at Anaconda was creating a community innovation team.
And so I wanted to replicate that.
This time, we did a lot of innovation at Anaconda,
and I wanted to do innovation, but also contribute
to the projects that existed.
Like create a place where maintainers,
so that SciPy and NumPy and all these projects
we already started can pay people to work on them
and keep them going.
So that's Labs.
Quansight Labs is a separate organization.
It's a nonprofit mission.
The profits of Quansight help fund it.
And in fact, for every project that we have at
Quansight, a portion of the money goes directly to Quansight
Labs to help keep it funded. So we've got several
mechanisms by which we keep Quansight Labs funded.
So I'm really excited about Labs, because it's been a mission
for a long time.
What kind of projects are within Labs? So Labs is working to make
the software better, like make NumPy better, make SciPy better.
It only works on open source.
So if somebody wants to, and companies do,
we have a thing called a community work order, we call it.
If a company says, I wanna make Spyder better, okay, cool.
You can pay for a month of a developer of Spyder,
or a developer of NumPy or a developer of SciPy.
You can't tell them what you want them to do.
You can give them your priorities and things you wish existed.
And they'll work on those priorities
with the community to get what the community wants,
and what emerges is what the community wants.
Is there some aspect on the consulting side
that is helping, as we were talking about morphology and so on?
Are there specific applications that are particularly driving,
sort of inspiring the need for updates to SciPy?
Absolutely. GPUs are absolutely one of them.
And new hardware beyond GPUs.
I mean, Tesla's Dojo chip, I'm hoping we'll have a chance
to work on that perhaps.
Things like that are definitely driving it.
The other thing that's driving it is scalable,
like speed and scale.
How do I write NumPy code or NumPy-like code if I want it to run across a cluster?
You know, that's Dask or maybe it's Ray.
I mean, there's sort of ways to do that now,
or there's Modin.
So pandas code, NumPy code, SciPy code,
scikit-learn code that I want to scale.
So that's one big area.
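The scaling question raised here, running array-style code across many workers, usually comes down to splitting the data into chunks, computing a partial result per chunk, and combining the partials. A minimal standard-library sketch of that pattern follows. It is not the Dask, Ray, or Modin API; the names `chunked`, `partial_sum`, and `chunked_mean` are made up for this illustration, and a thread pool stands in for a cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, size):
    """Yield consecutive slices of `values` with at most `size` items each."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def partial_sum(chunk):
    """Per-chunk work: return (sum, count) so the partials can be combined."""
    return sum(chunk), len(chunk)


def chunked_mean(values, chunk_size=1000, workers=4):
    """Compute a mean by mapping over chunks and then reducing the partials."""
    if not values:
        raise ValueError("cannot take the mean of an empty sequence")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunked(values, chunk_size)))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


if __name__ == "__main__":
    print(chunked_mean(list(range(10)), chunk_size=3))
```

The same map-then-reduce shape is what lets the per-chunk work be sent to separate machines instead of threads.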
Have you gotten the chance to chat with Andrej and Elon
about it, because like,
No, I would love to, by the way.
I'd very much love to.
I just saw their Tesla AI Day video.
Super excited.
So this is one of the, you know, I love great engineering, software
engineering teams and engineering teams in general.
And they're doing a lot of incredible stuff with Python.
They're revolutionizing
so many aspects of the machine learning pipeline
that's operating in the real world, and so much
of that is Python. And like you said, the guy running Autopilot, you know, Andrej Karpathy,
is tweeting about optimization of NumPy.
Oh, I'd love to talk to them. In fact, at Quansight we've been fortunate enough to work with
Facebook on PyTorch directly. So about 13 developers at Quansight, some of them in Labs, are working directly on PyTorch.
On PyTorch.
On PyTorch, right?
So basically, when we started Quansight,
I went to both TensorFlow and PyTorch.
I said, hey, I want to help connect what you're doing
to the broader SciPy ecosystem.
Because I see what you're doing.
We have this bigger mission.
We want to make sure we don't lose energy here.
So, and Facebook responded really positively
and I didn't get the same reaction.
Not yet.
Not yet.
I love the folks at TensorFlow.
I really love the folks at TensorFlow too.
They're fantastic.
I think it's just how it integrates with their business.
I mean, like I said, there's a lot of reasons.
Just the timing, the integration with their business,
what they're looking for.
They're probably looking for more users,
and I was looking to kind of kept some development effort,
and they couldn't receive that as easily, I think.
So I'm hoping, I'm really hopeful,
and love the people there.
What's the idea behind OpenTeams?
So OpenTeams, super excited about OpenTeams,
because
I mentioned my idea for investing directly in open source.
So that's a concept called FairOSS.
But one of the things when we started
Quansight, we knew we would develop products and ideas, and new companies might come out.
At Anaconda this was clear, right? At Anaconda, we did so much innovation that like five or six
companies could have come out of that. And we just didn't structure it so they could. But in fact,
they have. You look at Dask, there's two companies coming out of Dask. You know, Bokeh could be
a company.
There's lots of companies that could exist
off the work we did there.
And so I thought, oh, here's a recipe for an incubation
concept, that we could actually spawn new companies
and new innovations.
And then the idea has always been, well,
money they earn should come back to fund
the open source projects.
So Labs, I think there should be a lot of things
like Quansight Labs. I think this concept is one that scales. You could have a lot of open source
research labs along the way. So in 2018, when the bigger idea came of how to make open source
investable, I said, oh, I need to create a venture fund. So we created
a venture fund called Quansight Initiate at the same time. It's an angel fund, really. You know,
we started to learn that process. How do we actually do this? How do we get LPs? How do we actually
go in this direction and build a fund? And I'm like, every venture fund should
have an associated open source research lab. There's no reason not to. Like our venture fund, the carried
interest portion of it goes to the lab. It directly will fund the lab. That's fascinating,
by the way. So you use the power of the organic formation of teams in the open source community,
and then naturally that leads to a business that can make money.
Yeah, great.
And it always maintains and loops back to the open source.
Loops back to open source.
Exactly.
I mean, to me, it's a natural thing.
There's absolutely a repeatable pattern there.
And it's also beneficial because, oh, I have,
I have natural connections to the open source.
I have an open source research lab.
Like, they'll all be out there talking to people.
And so we've had a chance to talk to a lot of early stage
companies and our fund focuses on the early stage.
So Quansight has the services, the lab, the fund.
In that process, a lot of stuff started to happen.
We were like, oh, we started to do recruiting
and support and training.
And I was starting to build a bigger sales team
and marketing team and people besides just developers.
And one of the challenges with that
is you end up with different cultural aspects.
Developers, in any company you go to,
you kind of go look, is this a business-led company,
a developer-led company, do they kind of coexist,
what's the interface between them?
There's always a bit of a tension there,
like we were talking about before.
What is the tension there?
With OpenTeams, I thought, wait a minute,
we can actually just create this concept of
Quansight plus Labs.
While it's specific to the PyData ecosystem,
the concept is general for all open source.
So OpenTeams emerges as, oh, we can create
a business development company for many, many Quansights,
like thousands of Quansights,
and it can be a marketplace to connect,
essentially be the enterprise software company of the future.
If you look at what enterprise software wants
from the customer side, and during this journey,
I've had the chance to work with and sell to lots of companies,
Exxon and Shell and JPMorgan, Bank of America,
like the Fortune 100, and talk to a lot of people
in procurement and see, what are they buying?
And why are they buying? So, you know, I don't know everything, but I've learned a lot about, oh, what are they really looking for?
And they're looking for solutions. They're constantly given products from enterprise software.
Here's open-source-led enterprise software, now buy it, and then they have to stitch it together into a solution.
Open source is fantastic for gluing those solutions together. So
whereas they keep getting new platforms they're trying to buy, what most enterprises want
is tools that they can customize that are as inexpensive as they can be.
Yeah, and so you always want to maintain the connection to the open source because that's
something that's been made available.
Yes, so OpenTeams is about solving enterprise software problems.
Brilliant, brilliant idea, by the way.
With a connection, but we do it honoring the topology.
We don't hire all the people.
We are a network connecting the sales energy
and the procurement energy.
And we work on the business side, get the deals closed,
and then have a network of partners like Quansight
and others who we hand the deals to, right?
To actually do the work.
And then we have to maintain, I feel like we have to maintain
some level of quality control.
So the client can rely on OpenTeams to ensure the delivery.
It's not just, here's a lead, go figure that out.
But no, we're going to make sure you get what you need.
Right.
By the way, it's such a skill, and I don't know if I have the patience,
if I will have the patience to talk to the business people, or more specifically, I mean, there's all
kinds of flavors of business people, like marketing people.
It's a challenge.
I can hear what you're saying because I've had the same challenge.
Yeah.
And it is true.
There's sometimes you think, okay, this is way overwrought.
Yeah.
You have to become an adult, you have to, because the companies have needs.
They have to make money.
And they also want to learn and grow.
And it's your job to kind of educate them on the best way.
Like, the value of open source, for example.
Right.
And I'm really grateful for all my experiences
over the past 14 years, understanding that side of it
and still learning, for sure.
But not just understanding from companies,
but also dealing with marketing professionals
and sales professionals and people that make a career
out of that understanding what they're thinking about
and also understanding, well, let's make this better.
Like we can really make a place.
Like OpenTeams, I think, is the transmission layer
between companies and open source communities
producing enterprise software solutions.
Eventually, we want to, today, we're taking on SAS
and MATLAB and tools that we know we can replace for folks.
Really, anytime you have a software tool at an organization
where you have to do a lot of customization
to make it work for you.
It's not that you're just buying this thing off the shelf
and it works.
It's like, okay, you buy this system,
and you customize it a lot, usually with expensive consultants,
to actually make it work for you.
All of those should be replaced by open source foundations.
With the same customers.
You're doing amazing, such important work, such important work,
and there are these giant organizations that do exactly that,
taking some proprietary software
and hiring a huge team of consultants that customize it,
and then that whole thing gets outdated quick.
Correct.
I mean, that's brilliant.
Right.
The one solution to that is
kind of what Tesla is doing a little bit of,
which is basically build up a software engineering team.
Yeah.
Like build a team from scratch.
Build a team from scratch.
And companies that are doing it well, that's what they're doing right now.
Yeah, right.
That's okay.
And you're creating a topology for some of that.
You just don't have to do it.
That's not the only answer.
Right.
And so other companies can access this.
It's more accessible.
We literally say OpenTeams is the future of enterprise software.
We're still early.
Like this idea just percolated over the past year as we've kind of grown Quansight
and realized the extensibility of it. We just finished our seed round
to help get more salespeople and then push the messaging correctly, and
there's lots of tools we're building to make this easier. Like we want to
automate the processes. We feel like a lot of the power is the efficiency of the
sales process.
There's a lot of wasted energy in small teams and the sales energy to get into large companies
and make a deal.
There's a lot of money spent on that.
So it's preparing the tools and processes for that sales.
So make that super seamless.
So a single company can go, oh, I've got my contract with OpenTeams, we've got a subscription
they can get.
They can make that procurement seamless.
And then the fact they have access to the entire open source ecosystem.
And we have a part of our work that's embracing open source ecosystems
and making sure we're doing things useful for them, we're serving them.
And then companies, making sure they're getting solutions they care about.
And then figuring out which targets we have. You know, we're not taking on all of open
source, all of enterprise software yet, but we're, well, this feels like the future.
The idea and the vision is brilliant.
It's been, you know, ballboard and necessarily make a good meme about how we approach that, but names are broadening that. I think that's
why, because they recognized GitHub is where developers are at, right? And so-
But do they have a vision, like an OpenTeams type of situation, right?
I don't think so yet. I mean, are they just basically throwing money at developers to show their support?
I think so. It's about a topology, like you put it,
like a way to leverage that,
like to give developers actual money.
Right.
I don't think so.
I think they're still an enterprise software company
and they make a bunch of money.
They make a bunch of games.
They have a big company, they sell products.
I think part of it is they know there's opportunity
to make money from GitHub.
Right, there's definitely a business there,
you know, to sell to developers
or to sell to people using development.
I think there's part of that.
I think part of it is also they definitely wanted
to recognize that you need to value open source
to get great developers, which is an important concept
that was emerging over the past 10 years.
At PyData, we were able to convince JPMorgan
to support
PyData because of that fact, right?
What was worth the money for them, putting a couple hundred
thousand into supporting PyData for several conferences,
was they want developers and they realize that developers
want to participate in open source.
So enterprise software folks don't always understand
how their software gets used.
Having spent a lot of time on the floors at JPMorgan,
at Shell, at ExxonMobil, you see,
oh,
these companies have large development teams.
And then you're kind of dealing with
what's being delivered to them.
So I really feel kind of privileged
that I had a chance to learn from some of these people
and see what they're doing.
And even work alongside them, as a consultant,
using open source and trying to make this work
inside of a large organization.
Some of it is actually for a larger organization,
some of it is messaging to the world
that you care about developers and you're the cool,
you care, like for example, like Ford,
because I talked to them, like car companies, right?
They want to attract, you know,
you wanna take on Tesla and Autopilot,
you wanna take that, right?
And so what do you do there?
You show that you're cool.
You try to show off that you care about developers, and they have
a lot of trouble doing that.
And like one way, I think, like Ford should have bought GitHub.
Just to show off.
Yeah, like these old school companies, and it's in a lot of different industries.
There's probably different ways.
It's probably an art to show that you care about developers.
It's exactly what you like.
For example, just spitballing here, but like Ford or somebody like that could
give a hundred million dollars to the development of NumPy, and
like literally look at the top most popular
projects in Python and just say, we're just going to give money.
Right.
Like that's going to immediately make you cool.
They could, actually.
Yeah.
And we set up NumFOCUS to make it easy.
Yeah.
But the challenge is also you have to have some business development.
Like it's a bit of a seeding problem, right?
And you look at how, I've talked to the folks at the Linux Foundation, I know how they're
doing it, and starting NumFOCUS, because we had two babies in 2012.
One was Anaconda, one was NumFOCUS, right?
And they were both important efforts.
They had distinct journeys, and I'm super grateful that both existed and still grateful both
exist.
But there's different energies in getting donations than there is in getting, this is important
to my business.
Like, I'm selling you something, and I'm going to make money this way.
If you can tie the message to an ROI for the company, it's much more effective, right?
So, and there are rational arguments to make.
I've tried to have conversations with marketing departments.
Like very early on, it was clear to me that, oh, you could just take a fraction of your marketing
budget and just spend it on open source development, and you'd get better results from your marketing.
Like, because-
How do those, can I, sorry, I'm going to try not to go out of my hands here.
What have you learned from the interaction with the marketing folks on that,
because you gave a great example of something
that would obviously be a much better investment
in terms of marketing, which is supporting open source projects.
The challenge is not dissimilar
from the challenge you have in academia
with the different colleges, right?
Knowledge gets very specific and very channeled, right?
And so people
get a lot of learning in the thing they know about. And it's hard then to bridge that and
to get them to think differently enough to have a sense that you might have something
to offer. Because it's different. It's like, well, how do I implement that?
What do I do with that? Which budget do I take from? Do I slow down my spend
on Google ads or my
spend on Facebook ads? Or do I not hire a content creator? There's an operational aspect
to that, that you have to be the CMO or the CEO. You have to get to the right level.
So you have to reach the high position level. And they have to care about this.
Right. Or they won't know how. Right. Because you can also do it very clumsily, right?
And I've seen, because
you obviously have to honor and recognize
the people you're going to and the fact that
if you just throw money at them,
it could actually create more problems.
Can I just say, this is not you saying it.
Can I just, because
I need to say this, I've been very surprised
how often marketing people are terrible at marketing. I feel like the best marketing
is doing something novel and unique that anticipates the future. It feels like so much of the
marketing practice is like what they took in school, or maybe they're studying what was the
best thing that was done in the past decade, and they're just repeating that over and over as opposed to innovating, like taking the risk. To me,
marketing is taking the big risk.
That's a great point.
And being the first one to risk.
Yeah.
There's an aspect of data observation from that risk, right?
That's, you know, I think, could share what they're doing already, but
it's about, I think it's content.
Like, there's this whole world of content marketing
that you can almost say, well, yeah,
you can get inundated with stuff that's not relevant to you.
Whereas what you're saying would be highly relevant,
highly useful, and highly beneficial.
Yeah, but it's risk. I mean, that's why there's a lot
of innovative ways of doing that.
Tesla's an example of people that basically don't do marketing.
They do marketing in a very, like, I think Elon hired a person
who's just good at Twitter for running Tesla's Twitter account.
I mean, that's exactly what you want to be doing.
You want to be constantly innovating.
Right, there's an aspect of telling. I mean, I've definitely
seen people doing great work where they're not talking
about it.
Like I would say that's actually a problem I have right now
with Quansight Labs.
Quansight Labs has been doing amazing work.
Really excited about it.
We haven't been talking about it enough.
We haven't been.
There's different ways to talk about it, there's different channels
through which to communicate.
There's also, like, I'll just throw some shade at companies I love.
So for example, iRobot, I just had a conversation with them, they make
Roombas. Sure. And I love them, incredible robots. But every time they do,
like, advertisement, not advertisement, but marketing-type stuff, it just looks so corporate.
And to me, maybe I'm wrong in the case of iRobot, I don't know.
But to me, when you're talking about engineering systems, it's really nice to show off the
magic of the engineering and the software and all the geniuses behind this product and
the tinkering and the raw authenticity of what it takes to build that system versus
the marketing people who want to have like pretty people like standing there all pretty with the robots like moving
perfectly.
So to me, there's some aspect is like speaking to the hackers, you have to throw some bones,
some care towards the engineers, the developers, because there's some aspect one for the hiring,
but two, there's
an authenticity to that kind of communication that's really inspiring to the end user as
well.
Like, if they know that brilliant people, the best in the world are working at your company,
they start to believe that that product that you're creating is really interesting.
Because your initial reaction would be, wait, there's different users here.
Why would you do that?
You know, my wife bought a Roomba.
She loves developers, loves me, but she doesn't care about that culture.
But as you said, it's actually the authenticity, because everyone has a friend
or knows those people, and there's word of mouth.
I mean, if you...
Word of mouth is so, so...
Yeah, exactly.
And then...
Because I think it's the lack of that realization, there's this halo effect.
Right.
And also it influences your general marketing.
Interesting. For some stupid reason,
I do have a platform, and it seems that the reason
I have a platform, and many others like me,
millions of others is like the authenticity
and like we get excited naturally about stuff.
And like I don't want to get excited about that
iRobot video because it's boring,
it's marketing, it's corporate,
as opposed to, I want to do something fun. This is me, like, a shout-out to iRobot: they're
not letting me get into the robot.
Yeah, well, there's an aspect of, it could be benefiting from a culture of modularity,
like add-ons, and that could actually dramatically help.
You've seen that over history. Apple is an example of a company like that.
I can see what your point is,
is that you have something that needs to be,
it needs to be adopted broadly,
the concept needs to be adopted broadly.
And if you wanna go beyond this one device,
you need to engage the community.
Yeah, and connecting to the open source,
that you said, I gotta ask you,
you're a programmer,
one of the most impactful programmers ever,
you've led many programmers, you lead many programmers.
From a programmer's perspective,
what makes a good programmer,
what makes a productive programmer?
Is there advice you can give to be a great programmer?
That's great, great question.
There are times in my life I'd probably have answered this even better than I hope I can
today.
I've thought about this numerous times.
Right now I've spent so much time recently hiring salespeople.
Your mind is a little bit on something else.
On something else.
But I reflect on the past and also, the only way I can do this, I have some really great
programmers that I work with who lead the teams that they lead.
And my goal is to inspire them and hopefully help them, encourage them, and help them
encourage their teams. I would say there's a number of things, a couple of things. One is
curiosity. Like, I think a programmer without curiosity is mundane. Like, you'll lose interest, you won't do your best work.
So it's sort of an affect.
It's sort of, you can have some curiosity about things.
I think, two, don't try to do everything at once.
Recognize that, you know, we're limited as humans.
You're limited as a human.
And each one of us is limited in different ways.
You know, we all have our different strengths and skills.
So it's adapting the art of programming to your skills.
One of the things that always works is to limit what you're trying to solve.
Right? So, if you're part of a team, usually somebody else has put the architecture together
and they've given a portion to you, if you're young.
If you're not part of a team, breaking down the problem into smaller parts
is essential for you to make progress.
It's very easy to take on a big project and try to do it all at once and you get lost.
And then you do it badly.
So, thinking about, you know, very concretely what you're doing, defining the inputs and outputs,
defining what you want to get done.
Even just talking about that, like writing down, before you write code, just what are you
trying to accomplish?
I mean, being very specific about it really, really helps. I think using other people's work,
don't be afraid of that, like somehow you should do it all. Nobody does.
Stand on the shoulders of giants.
Copy and paste from Stack Overflow.
But don't just copy and paste.
It's particularly relevant in the era of Codex
and auto-generated code,
which I see as essentially an index
into Stack Overflow.
Right, exactly.
It's like a search engine.
It's a search engine over Stack Overflow, basically.
So it's not, I mean, we've had this for a while.
But really, you want to cut and paste, but not blindly.
Like, absolutely, I've cut and pasted to understand,
but then you understand, oh, this is what this means,
oh, this is what it's doing.
And as much as you can, so it's critical,
that's where the curiosity comes in.
If you're just blindly cutting and pasting,
you're not gonna understand.
And so understand, and then you know,
be sensitive to hype cycles.
Right, every so often there's always a,
oh, test-driven development is the answer,
oh, object-oriented is the answer,
oh, there's always an answer.
Agile is the answer.
Be cautious of jumping onto a hype cycle.
Like, likely there's signal,
like there's a thing there
that's actually valuable you can learn from,
but it's almost certainly not the answer
to everything you need.
Well, let's just draw from you having created NumPy and SciPy?
Like, in service of sort of answering the question of what it takes to be a great programmer
and giving advice to people. How can you be the next person to create a SciPy?
Yeah, so one is listen.
To who?
To people that have a problem, right? Which is everybody, right?
But listen, and listen to many, and then try to do,
you're gonna have to do an experiment,
fall down, don't be afraid to fall down.
Don't be afraid, the first thing you do
is probably gonna suck and that's okay, right?
Honestly, I think iteration is the key to innovation.
And it's almost that psychological
hesitation we have to just iterate.
Like, yeah, we know it's not great,
but next one will be better.
I mean, just keep learning
and keep improving.
So it's an attitude.
And then it does take intense concentration.
Right?
Good things don't happen.
It's not quite like TikTok or Facebook.
You can't scroll your way to good programming.
There are sincere hours of deep, don't be afraid of the deep problem.
Often people will run away from something because I can't solve this.
You might be right, but give it an hour, give it a couple of hours and
see. And, you know, just five minutes is not going to give you that.
Was it lonely when you were building SciPy and NumPy?
Hugely. Yeah. Absolutely lonely in the sense that you had to have an inner drive. And
that inner drive for me always comes from, I have to see that this is right.
In some angle, I have to believe it, that this is the right approach, the right thing to
do.
With SciPy, it was like, oh, yeah, the world needs libraries in Python.
Clearly, Python's popular enough with enough influential people to start and it needs more
libraries.
So that is a good in itself.
I'm going to go do that good.
So find a good, find a thing that you know is good and just work on it.
So that has to happen.
And it is.
And you have to have enough realization of your mission to be okay with the naysayers or
the fact that not everyone joins you up front.
In fact, one thing I've talked to people about a lot: I've seen a lot of projects come and some
fail.
Not everything I've done has actually worked perfectly.
I've tried a bunch of stuff that, okay, that didn't really work or this isn't working and why.
But you see the patterns, and one of the key things is you can't even know for
six months. I say 18 months right now. If you're just starting a new project, you've got to give it a good 18-month run before you even know if the feedback's there.
Like, you're not gonna know in six months. You might have the perfect thing, but six months from now, it's still kind of emerging.
So give it time,
because you're dealing with humans
and humans have an inertia energy
that just doesn't change that quickly.
So.
Let me ask a silly question,
but like you said,
you're focused on the sales side of things currently,
but back when you were actively programming
maybe in the 90s, you talked
about IDEs. What's your setup that you have that brings you joy, keyboard, number of screens,
Linux?
I do still like to program some. I just don't as much as I used to. I have two projects
I'm super interested in, trying to find funding for them, trying to figure out some good
teams for them, but I could talk about those. But, yeah, I'm an Emacs guy.
Right.
Thank you.
The superior editor, everybody. I don't often delete tweets, but one of
the tweets I deleted was when I said Emacs was better than Vim.
And then the hate I got for it.
It is.
I was like, I'm walking away from this.
I do too.
I, I don't push it.
I mean, I'm, I'm just joking, of course.
Yeah, exactly. It's kind of like, but people do take the editor seriously, right?
They take it. I did it as a show.
What's your life?
It is, but there's something.
There's something beautiful to me about Emacs, but for people that love
Vim, there's something beautiful to them about that.
I mean, I do use Vim for quick editing.
Like, if it's a quick edit,
I will still sometimes use it, but not much. Like, a simple correction, correcting a single character.
So when you were developing SciPy, you were using Emacs.
Yeah. SciPy and NumPy were all written in Emacs on a Linux box, with CVS, and then SVN,
for version control. Git came later. Like, I love the distributed branch stuff. I think Git is pretty complicated,
but I love the concept.
And also, of course, GitHub,
and then GitLab, made Git definitely consumable.
But that came later.
Did you ever touch Lisp at all?
Like, what were your emotional feelings about all the parentheses?
Great question.
So I find myself appreciating Lisp today
much more than I did early,
because when I came to programming, I knew programming,
but I was a domain expert, right? And to me the parentheses were in the way.
It's like, wow, all this just gets in the way of my thinking about what I'm doing.
So why would I have all these, right?
That was my initial reaction to it
You know, and now I appreciate the structure that kind of naturally maps to the logical thinking about a program.
I can appreciate them, right? And you could actually create editors that make it not so
problematic, right? Honestly. So I actually have a much greater appreciation of Lisp,
and things like Clojure, and there's Hy, which is a Python, you know, a Lisp that compiles to
Python bytecode. I think it's challenging, typically these languages
are, I even saw a whole data science programming system
in Lisp that somebody created, which is cool.
But again, I think it's the lack of recognition
of the fact that there exists what I call occasional programmers.
People who are never going to be programmers for a living.
They don't want to have all this cuteness in their head.
It's why BASIC, Microsoft had the right idea with BASIC, in terms
of having Visual Basic be the language of Excel and SQL Server.
They should have converted that to Python 10 years ago. The world would be a better place
if they had.
But there's also a beauty and a magic to the history behind a language,
and Lisp, you know, some of the most interesting people in the history of computer science and
artificial intelligence have used Lisp. So yes, you feel, well, especially that language,
when you have a language, you can think in it. Yeah, and it helps you think about it. And it attracts
certain kinds of people that think a certain kind of way. And then that's there. Okay, so what about, like, a small laptop
with a tiny keyboard, or is there, like,
three screens? You know, good question.
I've never gotten into the big, the many screens,
to be honest.
I mean, maybe it's because in my head,
I kind of just, I just swap between windows.
Like, partly because I guess I,
I really can't process three screens at once anyway.
Like, I just am looking at one and I just flip,
you know, I flip an application open.
So, where it's really helpful
is actually when I'm trying to, you know,
here's data and I want to input it from here.
Like, this is the only time I really need another screen.
So now, because you're both a developer
and you lead developers, but then there's also these businesses,
and there's salespeople, and you're
working with large companies.
Operations people, hiring people.
The whole thing.
Which operating system is your favorite, though, at this point?
So Linux was the early one.
Yeah, yeah, I love Linux on the server side.
And in the early days I had my own Linux desktop.
I've been on Mac laptops for 10 years now.
Yeah, this is what leadership looks like. Switched to Mac. Okay, great. Pretty much. I mean, just the fact that I had to do
PowerPoints, I had to do presentations and, you know, plug in. I just couldn't
mess with plugging in laptops that wouldn't project. So you mentioned also
Quansight Labs and things like that. Can you give advice on how to hire great programmers and great people?
Yeah, I would say produce an open source project.
Get people contributing to it and hire those people.
Yeah.
I mean, you're doing it sort of, you might be perhaps a little
wise, but that's probably 100% really good advice.
I find it hard to hire. I still find it hard to hire. Like,
it's not hard to hire if I've worked with somebody for a couple of weeks, but after an hour or two
of interviews, I have no idea. So that instinct, that radar of knowing if someone's good or not,
you've found that you're still not able
to really...
It's really hard.
I mean, the resume can help, but again, the resume is like a presentation of the things
they want you to see, not the reality of, and there's also, you know, you have to understand
what you're hiring for.
There are different stages and different kinds of skills. One of the
things I talk a lot about
internally at my company is that the whole idea of measuring
ourselves against a single axis is flawed.
Because it's a multidimensional space.
And how do you order a multidimensional space?
There isn't one ordering.
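The point about ordering a multidimensional space can be made concrete with a minimal Python sketch. The candidates, axis names, scores, and weights below are all hypothetical, invented purely for illustration; nothing here comes from the conversation. It shows that two skill profiles can be incomparable (neither is at least as good on every axis), and that any single ranking only appears after choosing weights, with different weights giving different rankings.

```python
from typing import Dict, List

Skills = Dict[str, int]

# Hypothetical candidates scored 0-10 on hypothetical axes (illustrative only).
candidates: Dict[str, Skills] = {
    "A": {"open_source": 9, "product": 3, "learning": 7},
    "B": {"open_source": 4, "product": 9, "learning": 7},
    "C": {"open_source": 3, "product": 2, "learning": 5},
}


def dominates(x: Skills, y: Skills) -> bool:
    """True if x is at least as good as y on every axis and strictly better on one."""
    return all(x[k] >= y[k] for k in x) and any(x[k] > y[k] for k in x)


def rank(weights: Dict[str, float]) -> List[str]:
    """Project each skill vector onto a single axis using weights, then sort descending."""
    def score(name: str) -> float:
        return sum(weights[k] * v for k, v in candidates[name].items())
    return sorted(candidates, key=score, reverse=True)


# A and B are incomparable: neither dominates the other.
print(dominates(candidates["A"], candidates["B"]))  # False
print(dominates(candidates["B"], candidates["A"]))  # False
# A does dominate C, so some pairs can still be ordered.
print(dominates(candidates["A"], candidates["C"]))  # True

# Two different projections produce two different "best" candidates.
print(rank({"open_source": 1.0, "product": 0.2, "learning": 0.5}))  # ['A', 'B', 'C']
print(rank({"open_source": 0.2, "product": 1.0, "learning": 0.5}))  # ['B', 'A', 'C']
```

The choice of weights is exactly the projection onto one axis: the ranking says as much about what you decided to need as about the people being ranked.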
So with this whole idea, you've immediately
projected onto one thing when you're
talking about hiring, or best or worst, or better or not better.
So what is the thing you're actually needing? And can you even hire for that?
Generally, I really value people who have the affect
of caring about open source. So in some cases, their being in open source is
simply kind of a filter for an affect.
However, I have found this interesting dichotomy
between open source contributors and product creation.
I don't know if it's fully true,
but there does seem to be: the more experience,
the more affect somebody has in the open source community,
the less ability to actually produce product they have.
And the other one is kind of true too.
The more product-focused they are,
I find a lot of people, I've talked to a lot of people
who produce really great products.
And they're looking over
at the open source community,
kind of wanting to participate and play,
but they've played here.
And they do a great job here,
and then they don't necessarily have some of the same,
and I don't think that's entirely necessary.
I think part of it is cultural, how they've emerged.
Because one of the things the open source community often lacks is great product management,
like some product management energy.
That's brilliant.
But you want both of those energies in the same place together.
Yes, you really do.
And so a lot of it is creating these teams of people that have these needed skills and
attributes, and that's
hard.
One of the big things I look for is somebody that fundamentally recognizes their need to
learn.
One of the values that we have in all of the things we do is learning.
If somebody thinks they know it all, they're going to struggle.
Some of that is just more basic things like humility, just
being humble in the face of all the things you don't know, and that's like step one
of learning. The step one of learning, right. And, you know, I've spent a lot of
time learning, right? Other people have spent a lot more time, but I spent a lot of
time learning. You know, my whole goal was to get a PhD because I love school
and I wanted to be a scientist. And then what I found is what's been written about
elsewhere as well: the more I learned,
the more I didn't know, the more I realized, man,
I know about this, but this is such a tiny thing
in the global scope of what I might wanna know about.
So I need to be listening a whole lot better
than I am just talking.
That's changed a little bit actually.
My wife says that I used to be a better listener.
Now that I'm so full of all these ideas I want to do, she kind of says, you've got to
leave people time to talk. So you've succeeded on multiple dimensions. One is tenure-track
faculty, the other is just creating all these products, then building up the businesses,
then working with businesses. Do you have advice for young people today in high school and college on how to
live a life as non-linear and successful as yours? A life that they could be proud of.
Well, that's a super compliment. I'm humbled by that actually. I would say
a life that you can be proud of, honestly, one thing that I've said to people is, first,
find people you love and care about them.
Like, family matters to me a lot.
And family means people you love and have committed to.
Right, so it's, can be whatever you mean by that, but you need to have a foundation.
So find people you love and want to commit to and do that.
Because it anchors you in a way that nothing else can.
And then from there you find other kinds of things
you can commit to, whether it's ideas or people
or groups of people.
So especially in high school, I would say
don't settle on what you think you know. Like, give yourself 10 years to think about the world.
Like, there's, I see a lot of high school students who seem to know everything already.
I think I did too.
I think it's maybe natural.
But recognize that the things you care about, you might change your perspective over time.
I certainly have. Over time, I've been really passionate about one specific thing
and I've kind of softened. You know, I was a big, I didn't like the Federal Reserve, right?
And still, we could have a lot of conversation about monetary policy and finance, but
I'm a little more nuanced in my perspective at this point. But, you know, that's one area
where you learn about something and go, I want to attack it. Build, don't destroy. So often the tendency is, people don't like something,
so they want to go attack it. Build something. Build something to replace it. Build up, you know,
attract people to your new thing. You'll get far more, far better, right? You don't need to destroy something to build something else.
So that's, I guess, generally.
And then, you know, definitely, like curiosity, you know, follow your curiosity and let it,
don't just follow the money.
And all of that, like you said, is grounded in family, friendship, and ultimately love.
Yes.
Which is a great way to end. Travis,
you're one of the most impactful people in the engineering and computer science world, in the
human world.
So I truly appreciate everything you've done.
And I really appreciate that you would spend your valuable time with me.
It was an honor.
It was a real pleasure for me.
I appreciate that.
Thanks for listening to this conversation with Travis
Oliphant.
To support this podcast, please check out our sponsors
in the description.
And now, let me leave you with something
that in the programming world is called Hutchinson's Law.
Every sufficiently advanced Lisp application
will eventually be re-implemented in Python.
Thank you for listening and hope to see you next time.
Thank you.