The problem stared back at me again today. Cold and calculating. Haunting yet welcoming. "I am scared" is my usual response. The vast emptiness of the white canvas signifies the difficult days ahead. The feeling of being lost rings loud and my knees start to bounce, readying my feet to run.
As a computer engineer and data scientist, I typically have seas of unrelated data. The problems that customers bring me are usually vague questions, and that is if I am lucky. Usually they start with just a sentence like this:
Customer: We don't know but can you look into this pile of data and see if there is anything useful?
Me: What data do you have?
Customer: Some Excels and some papers of notes.
Me: How long have you been collecting this?
Customer: Oh I don't know, two to three years. But we think it has the answer to our production problem.
Me: [Long Sigh]
The craft and the tool
Data science at its essence is lonely work. You start with the incomplete data that your customers collect, and your tool is software called "JupyterLab." The work mainly consists of many frustrating, unrelated tasks that together produce a semblance of intelligence. It is a bricolage of many disciplines, such as visual art, writing, deduction and logic.
I used to start my day with a set of to-do list items, but those days are long gone. I have found it is best to start lost. I have a big table with nothing on it but scattered blank paper and my favorite drawing pen. It is a Pigma Graphic 3 pigment pen. I keep a mood candle nearby. I learned over the years that it is better to admit that I have no clue how to start than to pretend.
An effort can last one hour or one day, and some I simply give up on after months of trying. Out of thousands of attempts, sometimes I find an approach that gives me a delight and surprise that I cannot express here. For example, when I discovered the power of statistical sampling. Being able to work on a small sample of the data and come up with a picture of the whole. It feels like doing a picture puzzle with a magic wand. To be able to put together 50 pieces of broken images, and then close your eyes and see the images appear in the darkness. Yeah, it is quite magical.
But days like that are quite rare and when they arrive, they disappear like a shy ghost that does not want to be seen. That is my prize in this endeavor. The kind of exciting AI and Data Science that you hear about is only in the movies. The real work is quite different.
Data science is more akin to writing. You start with an empty page, armed with disconnected ideas. Having no clue how the story will end. You will need to let the momentum of the first set of paragraphs lead you through the path of discovery.
You must find the courage in you to wrestle with the unknown. Start there, and you will be rewarded (sometimes, and it is not a guarantee) with additional information that propels your curiosity to take the next step. Like I said, it is like writing a novel.
Hemingway is the model
My idol is not a tech guru like Steve Jobs; my north star is Ernest Hemingway. Yes, the author. I heard of his work while in college, but paid no attention. Only later did I stumble upon his house in Burguete, Navarre, in northern Spain. I was searching for myself on a pilgrimage hike that started in a small village in France and ended in the town of Santiago, Spain. I was tired and took a rest on the nearest bench, only to find a plaque in English with a story indicating that this was Ernest Hemingway's house.
"You must be prepared to work always without applause," Hemingway said. "When you are excited about something is when the first draft is done. But no one can see it until you have gone over it again and again until you have communicated the emotion, the sights, and the sounds to the reader, and by the time you have completed this, the words, sometimes, will not make sense to you as you read them, so many times have you re-read them."
The life of a data scientist is full of contradiction: it’s a self-inflicted act done for the reward of finding the answer that others overlook. While it’s beautiful at the end, it’s also a remarkably lonely and frustrating walk.