I’m (not) looking for a job
TL;DR
This is a very personal post where I’m articulating some thoughts about my career so far, my immediate shift of direction, and how you can help me. I’m typically not a disclaimer user because I believe it’s pretty obvious I’m speaking in my own name, but let me for once express that I have a lot of respect for the companies I worked for, and that I take full responsability for any flaws I’m about to discuss here.
My contract with ThinkR ended on feb 23rd, and I’m now repurposing my time to focus exclusively on open source work. I’ll discuss that in detail below, but for now, here’s a link to my patreon page where you have an opportunity to help me stay free and work serenely on cool open source stuff.
Once upon a time
I was a bright kid, and everybody in the family was surprised when I picked up reading on my own (that’s at least the story I’ve heard many times). After some tests I skipped two classes and things still were easy. I’m not writing this to brag, quite the opposite. I attribute most of my character traits and flaws to essentially surviving through school being two years younger than everybody else.
I did not really pay too much attention in school because things were easy, I just knew how to do the bare minimum. At the end of high school I knew I would pass because I was solid in maths, so I did not bother doing the second page of the physics exam. Things like that.
I kind of went with the flow of what I liked and knew how to do, so I ended up doing maths in grad school in Aix en Provence and Montpellier.
But there was only so much abstract maths I could cope with, so after failing one year while discovering stand up comedy, I was guided towards stats and this changed everything. I might not be able to stretch that metaphor too far, but 🎤 and stats are two sides of the same coin for me. In both you collect large amount of data, and you translate it to something interesting, sometimes it’s 🤔, sometimes it’s 🤣, same thing. Too bad there’s no comedyverse
.
The R magnet
I first learned about R in Montpellier for a homework assignment about PCA with the state.x77
dataset. The assignment was in S+, which was installed in the university machines, but the professor handed us what felt like a contraband 💿 of something that looked like S+. This was R one point something.
Later that year we went to CIRAD with a few classmates to download a more recent copy of R and most of CRAN. In particular we wanted the akima 📦 for a spatial statistics assignment.
I never really took the time to think about it, but I guess these two assignments are what got me interested in R in the first place. This is when I started to check what’s inside, digest lots of documentation and code, and register to the mailing list. R was like a big magnet I was attracted to, and even though the community was perhaps not as welcoming as it has become now, I wanted to be a part of it. The more I knew, the more I wanted to know … I was in my early 20s, and this was the first time I worked hard for something, no matter what the difficulty was. I developed this passion about R, and open source in general, but mostly R.
I finished my curriculum at the ISUP, but I still feel that the university in Montpellier already equipped me with the tools I needed. ISUP is one of the oldest schools of statistics, and it did carry a lot of baggage at the time. I remember being furious about the absence of R in the program, to the point that I felt I had to teach it to my classmates, so I organized like an underground training session in the computer room. Everybody chipped in and they offered me two tickets to go see a play in a nice theater in Paris.
Whenever there was an assignment in SAS or whatever, I would essentially discard that piece of information and do it in R instead. R was going to be a huge part of my life.
Also, the curriculum was very geared towards theoretical statistics, lots of ink and paper, e.g. I thought I knew what bootstrap was (for loops and resampling and cool graphics) so I was not ready for that deluge of theory around it. In a way, I guess I was already choosing data science.
I’ve heard things are better now, but I’m not sure what to believe about the airquotes around data science in the front page.
For my internship at the end of ISUP, I worked with inria on improving mixmod. This is how I learned C++. This was not really a stats internship, this did not really involve R, this was more like a computer science internship. In the context of ISUP, I was setting myself up for failure by going through with this internship. One of the professors took it way too personally and rejected my defense of this internship the first time. Everybody thought I was joking when I announced that I failed, that did not really help with the pain. I did it again a few weeks later with less lines of code in the slides and more lines of maths, and I passed with flying colors.
Not a Doctor
After that, amputed of some self confidence and without giving it much thought, and also being busy preparing for the arrival of my first 👶, I accepted to carry on with the same team and embarked no questions asked on the way to a thesis on a different topic. This was a terrible plan. Not being able to say no is generally a very bad idea. I don’t think I should be a source of wisdom, but here we go:
Learn to say no. Use that skill often.
The 10 months this lasted were very hard on me, I felt stuck between the university and the industry partner, constantly going from one office to another and acting as a messenger. I don’t think I did anything of any value during these few months.
Think about what you like, and do that.
My daughter was born and I was getting a sense of accomplishment through the R graph gallery, a website I put together to showcase graphics produced by R alongside the code. This was a lot of very fragile php code on top of a mysql database and poor web design, but people liked it. This would have been sooooo much nicer if blogdown existed at the time.
The deployment was such a pain too, I had to enter things for new graphics in a mysql database, and update the site through ftp. It lived in a real server to run the php code, and at some point during one my my disappearing episodes, I forgot to renew it so the last version essentially vanished. Life is so much sweeter now with things like github and netlify.
I might be able to find it again, but Yan Holtz picked the project and he is now maintaining a version on top of wordpress.
Anyway, through some 🎩 🐇, I was attending useR! 2006 in Vienna with a poster about my work on mixmod. Life changing experience, useR was much smaller at the time but still this was quite intimidating, and you already had the feeling of being bombarded with information, new things to learn.
Consulting
Overall, the conference was reinforcing my desire to be part of the community. Some people knew about the graph gallery, so I talked more about it than about my poster. This is also where I met with people from Mango. I was like wait, I can write R code all day and be paid for it, that sounds cool. That sounded better than what I was doing. I didn’t think about it twice.
A few months ⏩, I was in a pub in Chippenham doing an interview. We moved to Bath with a 6 months old 👶, and apparently I was a consultant now.
This was my first real job, I travelled to many places, attended conferences, and learned a lot of things. Most of what I know about being a consultant comes from the two years at this job, most of the struggles too. Not too much open source, although I was quite extreme about using open source as much as possible. Dumping windows to install some funky (or maybe even not that funky) linux distribution was a good way in 2006 to create some friction with people and software.
It only took me 2 years to develop enough anxiety to break down, and again that is totally on me. Mango was a small (6 when I joined, twice that when I left) company, people were nice and friendly, we went to the pub frequently and played Badminton, they were super helpful for the whole expat with a baby process.
What triggered the break down was that I had to organize an Advanced R Training Course. I did not know what it meant, I still don’t really know, but in my head it meant something I’m not good enough to deliver, it meant people are going to ask me questions I won’t have answers for, it meant I was going to make a fool of myself. And learning more stuff for such an occasion might not be a good plan, because essentially the more you learn, the more you realize how much more you need to learn, and this is daunting in the process of teaching advanced professionnal expert level (add yours) training. Don’t get me wrong though, learning more stuff is generally a good idea, but not really when you do it as part of preparing one of the millions things you think you have to master tomorrow.
I really enjoy when I’m a room with people transmitting things that I know about R, but at the same time preparing the materials is something I always struggle with, always, no matter the topic, no matter how much I know about it. Knowing something and be able to teach it are two very different games.
I took some time off, but I did not really recover, and we kind of wanted our daughter to go to a french school. So we came back to France, and I was like Ok, so I’ll just do the same exact thing, consulting and training but as an independent freelancer, how hard could it be, right? Mango was super helpful in the transition process, they even gave me an extra gig. My former employer was my first client, that was nice.
I also had a few contacts, this materialised to few consulting contracts. And anyway, worst comes to worst I can always go hunt for a job. I was very naive to believe that I could pull it off. On top of not being a consultant (more on that below), dealing with paperwork, invoices, spending time on the phone, estimating the time something will take me, multiplying that by two, but still being way off, coming up with how much I think my time is worth, … I was not ready for any of that.
By some miracle, this lasted 8 years, and let’s just say it has not been the best financial journey. I was structureless, not really giving careful thought about the job. Not thinking is kind of a pattern of me, I just realized it now because of thinking about it, that’s refreshing.
Think
I’ve had some nice projects as a freelancer, organized a few training sessions, travelled to many places again, but for the most part, when you do consulting, not only do you work on projects you don’t choose, but on top of that you work on something someone else did not want to do. Sometimes you’re lucky and that’s a nice project, sometimes you might learn something from it, but most of the time you’d rather be doing something else.
Open source Asylum
For me, that was seeking asylum in open source projects, I already developed a taste for connecting R with other languages, that started with Java. My first trip to the land of the R internals was because of rJava. I liked the 📦 but the jni inspired low level interface was a bit hard to grasp, so I helped with the reflection based api with the J
function and the use of the $
method to access fields and methods.
I discovered Google’s protocol buffers and I liked it, that usually means I’m likely to do a package about it, and I could reuse things that I learned from working on rJava
. protobuf
has a C++ library, so it was very natural to use it to bring them to R. There was a thing called Rcpp
that existed at the time, so I started using it to build RProtoBuf
.
Rcpp was not what it is today: data was copied on the way in from R objects to C++ structures, like std::vector<double>
and on the way back from the C++ structures to R objects, the interface was weird, all the classes started with the Rcpp
prefix, … anyways there was a lot to be unhappy about.
Again, thanks to previous work on RJava
, I knew about R_PreserveObject
and R_ReleaseObject
, and starting to use them is one of the two best things that ever happened to Rcpp
(the other one being attributes from JJ ).
/* preserve objects across GCs */
void R_PreserveObject(SEXP);
void R_ReleaseObject(SEXP);
That changed everything, we could now have nice syntax, without paying for expensive data copies. When an Rcpp
object (e.g. a NumericVector
) is constructed, its underlying R object goes through R_PreserveObject
to protect it from the garbage collector, and when the object is destructed, a call is made to R_ReleaseObject
to remove that protection.
I ended up working almost full time on Rcpp for several months, and then users started to show up so we had to support them through mailing list and stack overflow and things like that. That’s a lot of work. I liked it but did not really pay attention to the fact that I was essentially doing all this work for free. For sure down the line, I organised a few training sessions about Rcpp
and I had some contracts around it, but I think it is fair to say that I never really got return on my time investment on the project. We did not really know that this was going to power many cran packages at the time.
Developing Rcpp
was for me a way to learn about R internals and C++ at the same time, while doing something that would be generally useful to others. I usually try to install some community side effects when I learn something.
And then this happened.
I believe answering that question from Hadley on stack overflow had a huge impact on a personal level for sure, but also most importantly on the community. 2010 was a pre-dplyr pre-magrittr-pipe R world, and plyr
was running the show. I never really got on the plyr
🚅 and catching Hadley’s attention about Rcpp
is one of the things that led to dplyr
.
Working on dplyr
is definitely the best thing that ever happened to me. There were clear goals, but a lot of freedom for the implementation, and since speed was high on the list, I naturally ended up using a lot (too much perhaps) of Rcpp
. And 🌸 on the 🍰, I could invoice my time.
I’m not a consultant
I still had private consulting projects, but then a weird pattern started to emerge. In short, I lost interest in the private consulting projects, so these projects would greatly slow me down and build anxiety. I still had to deliver them, and somehow that led to a time black 🕳️, all my time was going in there for almost no results, or a very slow 🐌 progress.
Then the feeling of guilt, why would I be allowed to work on cool (paid nor not paid) projects while I was feeling guilty of not being able to make progress on the others. I then ended up spending less and less time on the things that I liked because the things I didn’t were relentlessly eating all of my time.
Eventually the cool projects were too far, and I could not find the energy to get back at them, and the shame of having disappeared did not help me reaching out. All I would have had to offer was excuses, sorry I’ve been busy and all that. shame.
Being late at delivering also means no money, no stability etc … and feeding my imposter syndrome 👿.
The job offer from ThinkR felt like a blessing. I would have a salary, people to interract with, challenging work, and an 80/20 setup so that I could spend 20% of my time on whatever. It has been a nice adventure, the job is interesting and the company is ubiquitous in the 🇫🇷 R community, something I’ve never been able to do on my own.
The old demons were still present though. I can only blame myself for that, but that gluttoned my 20%. I’m not saying I’ve not been present and encouraged to be during my year, e.g. even though I did not get to go to useR! last year for 👶 reasons, I have been active on twitter, which triggered curiosity about emojis, planting the seed of future work on the emo
📦.
Community-me was trying to get out, that twitter activity led to the tweetstorm shiny app (I only realized later that the word meant something else), which we then used in shiny training courses. Same story with the prenoms package, the shiny app about the french legislatives elections, and the collage package, which were then a part of how I would build training material.
This was more evenings and weekends than 20% though, because I did not manage to do what I wanted, and was supposed, to do with my 80%, and depending on the projects on my plate, I would end up being late and exhausted at the end of 80% anyway.
So what now
Again, that’s all on me, but having spent some time to think about it (yeah I do that now), I’m forced to realize that being a private consultant is not what I am, and not what I want to be, and for the most part if I believe the conversations I typically have at conferences, not really what is expected of me.
My initial one-year contract with ThinkR ended a month ago, and I chose to not renew it so that I can take over control of my time. But this time, I’m putting a lot of thought into what I want to do with it.
Some people do open source on the side, some people are lucky enough to work for companies that have a very strong commitment to open source. Apart from that, two patterns that I see a lot are - build an open resume, build enough of a github streak to weight in at some future job application. - rewards. I did good on my day job, I’ll reward myself with some open source work, I’ll have a nice feeling about giving back.
Although I think at some points I’ve matched these patterns, I now believe that open source work is neither a stepping stone nor a treat, but real day job kind of work.
My goal now is to spend 100% of my time doing open source work. I’ve been thinking about this a lot, so here’s how I am organizing my weeks now:
on Mondays, I’ll work on dplyr. There’s still plenty of work to do here, for now I’ve been dealing with issues and bugs, but eventually there is still room for innovation.
on Tuesdays, I’ll work on Apache Arrow to add R as one of the supported languages, following the steps of the python front end.
on Fridays, I’ll work on ergo, which will bridge R and
go. For now that means writing a proposal to get funding.
This has worked well so far. It involves some context switching, but forcing myself to block these days gives me enough focus.
That leaves Wednesdays and Thursdays not associated with strong commitments; I’ll use these days to get involved in miscellaneous projects, not necessarily my own.
Help
I have setup a patreon page to give the community a way to help me stay free and continue to get involved in open source projects with some serenity.
I totally understand that this is not for everyone, and that’s fine. I don’t live in the illusion that I can make an income out of this, but at least this offers a concrete possibility for people or companies to chip in. I believe I’m much more useful to the community when I’m free.
I know we have now more formal ways to get funding, but these typically are project oriented: you write a project proposal, you submit it and perhaps you get some funding. This is what I will do for ergo
, but this does not apply to miscellaneous small projects and various efforts to support the community.
Platforms such as patreon are people oriented and thus are a better match for supporting someone who works creatively, as you support someone’s process instead of the end result.
Every donation is welcome and useful. I arbitrarily setup donation suggestions as powers of 2 (2^(0:10)
) with emoji titles and some indication of what you’d help me cover. The first few goals I’ve setup are all about covering expenses related to how I work, things like internet connection and renting a space in a coworking office.
I want to believe in that setup, 3 days a week with strong commitments to 3 important projects, and the remaining two days with more freedom to be involved in miscellaneous projects and communities.