The end $ of regex AND/or PROGRAMMING as we know it?

You are throwing a party. It is a big one. You need to create a list of 200+ guests to send to your organizers, planners, printers, registries, and all the other behind-the-scenes magicians who make these types of events happen. Your contacts are all over the place - some in your personal email, some in your work email, and some in that semi-organized contacts app on your phone. You start with the easiest - email. You bulk-paste the addresses into Excel. But they are unformatted... All you want is a clean spreadsheet with first name, last name, and email address. Is that too much to ask? Didn't they promise that the computers of the future would be able to read your mind? If you know your way around Excel, you will perform advanced acrobatics with "split column" and a combo of "search & replace". Perhaps you are an advanced user and can create something like **=LEFT(e_address,FIND("separator",e_address)-1)**. If you don't know how to Excel, your only option is to spend about 60 minutes manually retyping things. But if you can code in one of the programming languages, you can use regex!

With regex, things like this take 3 lines of code (depending on which language you are using) and under 0.03 seconds to run (depending on which machine you are running it on). Magic. /^(\w+)\..*\.(\w+)@/ . Regex is meant for problems like this - it is a Swiss Army knife for slaying and manipulating text data. In a nutshell, regular expressions, or regex for short, are a set of symbols for finding patterns in strings. However, this definition hugely understates the universality and utility of regex. Regex is language agnostic [1], so it can be used across codebases, disciplines, and domains. Regex is almost* universal (*there are multiple flavors), since most popular programming languages either have a regex library or have regex built right into the language. And the applications are endless: verify whether input fits a text pattern, find text that matches the pattern within a larger body of text, replace text matching the pattern with other text or with rearranged bits of the matched text, split a block of text into a list of subtexts, and shoot yourself in the foot.
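To make the guest-list cleanup concrete, here is a minimal Python sketch. It assumes every pasted address has the simple shape first.last@domain, which is a narrower shape than the pattern quoted above (that one expects an extra dotted part between the two names). The function name and the sample addresses are made up for illustration.

```python
import re

# Assumption: every address looks like first.last@domain.
NAME_PATTERN = re.compile(r"^(\w+)\.(\w+)@")


def split_address(address):
    """Return (first name, last name, email), or None if the shape differs."""
    address = address.strip()
    match = NAME_PATTERN.match(address)
    if match is None:
        return None  # leave unusual addresses for manual review
    first, last = match.groups()
    return first.capitalize(), last.capitalize(), address


# Hypothetical sample data, not real guests.
pasted = ["ada.lovelace@example.com", " grace.hopper@example.org "]
rows = [split_address(a) for a in pasted]
```

Each tuple in `rows` maps onto one spreadsheet row. Addresses that do not fit the assumed shape come back as `None` instead of being guessed at, so they can be fixed by hand.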

But as with everything in life, there is a catch. Regexes are hard. They are hard to read. They are hard to write. They are hard to document. They are also hard to master: while the rules of regex are finite and straightforward, learning them without useful applications is like learning Latin for modern diplomacy. And even if you once became extremely proficient in regex, if you don't use it often you will have to look the rules up again every time. So yes, it will take 0.03 seconds to run, but it can take you a good 30 minutes to first figure out what the regular expression should be.

For all their simple construction, regular expressions have a metamorphic history. Neuroscientist Warren S. McCulloch and logician Walter Pitts worked on logical calculus models to describe how the human nervous system works [2]. Mathematician Stephen Kleene extended these models with an algebraic notation that he called regular sets/regular expressions [3]. Computer scientist Ken Thompson implemented the idea of regular expressions inside the text editor ‘ed’ to deal with mundane tasks. The result was almost magical - the editor allowed users to apply “wildcard” matching pretty much anywhere in the operating system that text search was required [4]. The last boost that brought regular expressions to the commoners was given by Larry Wall when he made regex a core feature of his text-oriented programming language, Perl. In fact, regex made Perl the “duct tape” of 1990s web development [5]. Today regular expressions can easily be found as part of the core libraries of most programming languages. But no matter how many times you have written regex and how experienced a developer you are, the unfortunate fact is that if you don't use regex often, you end up re-learning regular expressions over and over again.

Yet for the first time in the history of computer programming, writing regex might not be a chore. A new wave of AI tools is coming to the rescue, letting regular humans use machine learning to generate regex from a short description. While tooling to make regex easier is nothing new (there is no shortage of regex generators and aided regex constructors) [6], what is new and different this time is that tools like autoregex.xyz allow you to type in plain English what you want ("extract first and last name from email address" or "replace Celsius with Fahrenheit") and get a straightforward translation into regex [7]. Who would have thought that in the age of overhyped AI capabilities promising everything from curing cancer to helping you babysit your kids, a headline that you will likely see is "Machine Learning will help you conquer regex".
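For a sense of what the second request involves, here is a hand-written Python sketch. It is not the output of autoregex.xyz or any other tool. It assumes temperatures appear as a number followed by an optional space, an optional degree sign, and a capital C, and the function name is made up for illustration.

```python
import re

# Assumed input shape: a number, optional space, optional degree sign, then "C".
CELSIUS = re.compile(r"(-?\d+(?:\.\d+)?)\s*°?C\b")


def to_fahrenheit(text):
    """Rewrite each Celsius reading in text as Fahrenheit, one decimal place."""
    def convert(match):
        celsius = float(match.group(1))
        return f"{celsius * 9 / 5 + 32:.1f} F"
    return CELSIUS.sub(convert, text)
```

The regex only finds the readings. The arithmetic lives in the replacement function, which is why a request like this is more than a one-line pattern.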

The AI-nization of regex is one instance of the new wave of computer-aided code writing. In 2021 GitHub released Copilot - an AI pair programmer tool that turns natural language prompts into coding suggestions. In 2022 AWS announced CodeWhisperer - an ML-powered coding companion (a similar thing, but from AWS). Both tools let developers type in plain English a description of what the code is supposed to do, and the tool recommends a code snippet that can tackle the description. Both tools sit on top of a branch of machine learning - Large Language Models (LLMs). LLMs are "trained" on large text-based datasets, and the trained models can recognise written requests and generate things like articles and dialogue. And while they have shocked and awed us over the last couple of years, in my opinion computer-aided programming, including the aforementioned autoregex, is probably one of the most pragmatic AI use cases out there. Some might point out that translating plain English to code is the essence of a programmer's job and hence an AI code whisperer might diminish programmers' thinking ability, just like Google Maps destroyed our navigation abilities [9]. However, for years classic computer science books have argued that the best way to write clear code is to first declare in plain English what it is supposed to do, and have encouraged "pen & paper programming". Hence typing instructions to our co-pilots to achieve clarity of code is probably what Uncle Bob [8] wanted us to do all along.

"We make our tools and thereafter they shape us" famously noted Marshall McLuhan. He was an academic celebrity who shared a lot of thoughts on how media shapes our environment, our outlook, and eventually us (his other famous quote is "the medium is the message"). He argued that when we introduce a new medium into society, it changes how we feel and how we relate to the world, and in the end changes how we behave and think. We humans are tool builders: we build things that make our lives safer, easier, and more pleasant. For a really long time the work of programming computers required patience, attention, and the ability to retain mental models, systematically debug programs when they go wrong, and creatively translate tasks into machine language. When StackOverflow [10] came along it changed programming, letting developers around the world solve problems, seek snippets of code for reuse, improve their own code, and discuss technical concepts, all while freeing our overloaded memory box. StackOverflow flipped the industry from "remembering" to "being skilled in asking questions and searching for answers" [11]. Did it change how programmers think in code? Maybe, but StackOverflow definitely made programming more accessible to more people.

The AI code generation tools mark the next milestone in the professional evolution of code writing. Our concerns around these tools come from the standard Man-vs-Machine theme. If the mighty regex is going to fall to automation, is that what is going to happen to other parts of the programmer's job? It is a bit ironic that one of the first jobs to become AI-automated is the profession that introduced automation in the first place. And while my AI pair programmer may eventually take my job, I feel a bit of relief thinking that I would never have to construct a regex again and could finally get a comprehensive guest list to the planners.

[1]  Almost, as scripting languages tend to have their own regular expression flavor built in.

[2]  The year is 1943 and their idea looks like this: the neuron allows only binary states, i.e., ‘0’s and ‘1’s, so it is called a binary activated neuron. These neurons are connected by direct weighted paths, and a connecting path can be excitatory or inhibitory. The neuron fires if the net input to the neuron reaches the threshold. The threshold is set so that inhibition is absolute, because any non-zero inhibitory input will prevent the neuron from firing. Y = McCulloch-Pitts neuron, which can receive signals from any other neurons. W = weights of the neuron. Weights are excitatory when positive and inhibitory when negative. The McCulloch-Pitts neuron has the activation function f(Yin) = 1 if Yin >= ϴ, and f(Yin) = 0 if Yin < ϴ.

[3]  The year is 1956 and the expression looks like this: "By a regular expression, we shall mean a particular way of expressing a regular set of tables starting with single-table sets and applying zero or more times the three operations (passing from E and F to E ∨ F, EF or E∗F)."

[4]  The year is 1968 and the expression looks like g/pinky/p, i.e. g/regular expression/p, where g and p are modifiers: g tells the editor to search for the pattern throughout the document and p tells it to print the results to the screen. "Global regular expression print", for short, is what we now call grep.

[5]  The year is 1987 and the expression looks like $foo =~ m/fee|fie|foe|fum/

[6]  There are a variety of tools that let one experiment with regex, such as https://regex-generator.olafneumann.org/ and regexr.com. Tools of this kind let you paste text, point at the parts you want, and then generate the actual regex.

[7]  For the curious, the response from autoregex.xyz is ([a-zA-Z]+)@([a-zA-Z]+)\.([a-zA-Z]+)

[8]  Robert Cecil Martin (colloquially known as Uncle Bob) is an American software engineer and author. He is a co-author of the Agile Manifesto and is famous for writing the book Clean Code, about keeping code manageable.

[9]  Numerous books, like Wayfinding: The Science and Mystery of How Humans Navigate the World by M.R. O’Connor, Pinpoint: How GPS Is Changing Technology, Culture, and Our Minds by Greg Milner, and Never Lost Again: The Google Mapping Revolution That Sparked New Industries and Augmented Our Reality by Bill Kilday, attempt to measure the impact of GPS on our lives, showing a significant impact on our spatial memory.

[10]  StackOverflow is a question-and-answer website for professional and enthusiast programmers.

[11]  The anonymously published manual "Copying and Pasting from Stack Overflow" is the quintessence of software development techniques. Mastering this art will not only make you the most desired developer in the market, but it will transform the craziest deadline into "Consider it done, Sir". https://www.goodreads.com/book/show/29437996-copying-and-pasting-from-stack-overflow

Let the streets choose your books

How do you pick your next read? Do you get swayed by opinionated bestseller lists? Do you allow yourself to serendipitously stumble on a random read in the bookstore? Do you get persuaded by your erudite friend? Do you entrust yourself to the hands of data-bloated algorithms? While all of these methods are noteworthy means to navigate the repository of human knowledge, I want to recommend a more adventurous method - meandering the streets. You will have to make a pilgrimage to Polanco, a neighborhood of Mexico City where the streets are named after great poets and dramatists. They will guide you on a journey across world literature.

Start at the corner of leafy Schiller. Take slow, unhurried steps. You will get to the broad Avenue Horacio, where you will cross vivacious Lope de Vega, Hegel, Emerson, and Lamartine. The traffic will pick up at Eugene Sue and Tennyson. Take zippy Alexandre Dumas to Masaryk and you will find yourself in the heart of Polanco. The shopping crowd will industriously ramble through the stores between Alfredo de Musset and La Fontaine, almost mimicking the social caricatures in the works of both. If you get tired of storefronts, hide in an urban oasis by taking Julio Verne to Virgilio. Virgilio is lively, but with cafes instead of shops. Make your way to Oscar Wilde street, which, just like the author, is filled with a dashing crowd. Finish the literary tour by escaping to delightfully sleepy Charles Dickens. Residential Goldsmith and Ibsen will charm you. Your promenade has just covered quite a few masters of world literature.

The literary streets of Mexico City have puzzled me ever since I stumbled upon Oscar Wilde street on my first visit to the city. Alas, extensive googling could not produce an explanation of the curious street-naming conventions. Luckily, a well-read friend who calls this city his home suggested a book that would shed light on the mystery. Despite the suspicious title, "In the Shadow of the Angel", I entrusted my curiosity to his hands.

And so I got acquainted with Antonieta Rivas Mercado, a bold, vivacious Mexican woman who, although unfamiliar to most, was quite a scandalous personality in the history of Mexico City. Her Wikipedia page is surprisingly brief, yet she has quite a list of contributions. The muse and patron of the young writers, artists, and musicians of the twenties, she whipped up the cultural life of Mexico City. I could not find written confirmation of how Antonieta got to name the streets, but one can assume that since her father was a notable architect who designed the independence column in downtown Mexico City, and since she stirred the city's intellectual life, she was the right candidate for the job of helping shape the rapidly developing city. There are multiple neighborhoods in Mexico City where the streets are thematically named after flowers, cities, US states, and rivers. Kathryn Blair, the author of the above-mentioned book, mentions that when the Lomas de Chapultepec neighborhood was being developed, Antonieta was involved in naming its streets. The neighborhood stretches from "Alpes" to "Sierra Nevada" and from "Monte Altai" to "Monte Everest". But a more interesting denomination took place in Polanco, where she chose to put writers and poets on the same pedestal as the usual street-worthy names of famous politicians and generals.

When the COVID-19 pandemic began, I decided that there could not have been a better time to let the streets of Mexico City and Antonieta determine my reading list. I used Google Maps to zoom into parts of Polanco, trying to capture all of the authors and look up their literary works. While some were familiar classics like Dumas and Andersen, some, like Calderon de la Barca, were personal discoveries. My literary journey let me revisit childhood favorites like the unbelievably farsighted science fiction novels of Jules Verne and the savant fables of Jean de La Fontaine. I was forced to take on the classics by Francesco Petrarch and Friedrich Schiller that I had heard of but never actually read. And then there were beautiful new discoveries like Eugene Sue's "The Mysteries of Paris". The list definitely skewed heavily toward Western literature, but somehow it was more fun and unexpected than the mandatory school curriculum. Without a doubt, this reading compendium reflects a specific view on writers, yet it exposes one to specific masterworks: the ones that a 20th-century intellectual should have been familiar with.

My reading order was unmethodical, but the look of my books was meticulously planned. I wanted my bookshelf to visually capture this bookish adventure, so I searched for a uniform publisher's series of the kind that gives one's library a pedantic look. I settled on Penguin Classics for their spartan black spines and diverse range of works. And so as the pandemic progressed, so did my pile. Towering higher and higher, it visually marked my virtual travel around Polanco.

With the introduction of vaccines, travel became less risky and I could travel to Mexico City. I neatly packed my stack, which corpulently occupied half of my suitcase. And so I walked the streets of Polanco. While the settings and architecture were new, I was exceptionally familiar with the street grid. Each turn of my walk reminded me of the plots and characters of the author after whom the street was named. And I kept thinking about Antonieta, about her reasons for selecting these writers, about her tastes. I wondered if she wanted everyone walking on these streets to be inspired to get familiar with the works of each street's namesake.

And while few people would ever know about Antonieta or question the naming conventions of Polanco's streets, it feels extremely cool to tell a taxi driver: "Drop me on the corner of Moliere and Castellar".

The smell of strangers

I am pretty certain that the time of the pandemic caused by COVID-19 is going to generate a flood of essays. Humans find it essential to express their frustration, nostalgia, ways of passing time, and aspirations. I am no different, and here I am trying to express the thing that I miss most. Most of all I miss the smell of strangers.

As we spend time in our confined spaces, the smells that we experience daily become conventional. The smell of ground coffee, while distinctive, occurs regularly every morning, becoming a routine. My perfume, while invigorating, is redundantly familiar and becomes customary in a matter of minutes. My special evening treat, an aroma candle, is rapidly losing its appeal. What I miss most is not pleasurable smells of high caliber, but rather the unpredictability and novelty of olfactory experiences. The irritating smell of a stinky gasoline truck. The sweet and sour smell of sweat in the subway. The smell of worn-out leather in the back of a taxi cab. Someone’s “too flowery” perfume as they share an elevator with me. I miss the strange smells that catch me off guard and make me take notice.

While confinement due to the pandemic has deprived us of unexpected sensations across multiple senses, such as sight and touch, I will argue that smell is the hardest to simulate in our dwellings and hence the most sorely missed. While I miss the outside noises of my city, I currently have access to most of the music ever produced by humans. Besides, through cinematographic experiences varying from short vlogs to movies, I can encounter most of the sounds I yearn for - the whisper of leaves in the forest, the crunching of snow during winter hikes, the energetic typing sounds in offices. The same can be said about the visual experience: while my visual range is bound mostly to the white walls of my apartment, different mediums constantly entertain my visual senses. I can take virtual tours of museums, scroll through pictures of friends and strangers across social media, and attend a performance at the Met Opera. Not to mention the variety of visual stimulation from a myriad of movies, from classics to blockbusters. Which makes me state the obvious explicitly: an enforced lockdown could not have been this bearable in any other decade. In the times when people either owned a TV with select content and specific viewing times or did not have anything at all (basically pre-19th century), all of the senses would have been put to a very different kind of trial.

So back to the thing that it seems I miss most. There is no way to easily access foreign smells in the privacy of one’s apartment, even if you frequently open the windows. The lack of smell simulations is caused by the human inability to capture smells in the first place. While I can get a new perfume, it is hard for me to capture the smell of fresh paint in the classroom at the beginning of the school year, upon return from summer vacation; the smell of my favorite bakery right after they have produced a fresh batch of croissants; or the smell of the forest in that particular part of a mountain range in the middle of the Peruvian Andes. Psychologists have shown time and time again the ease with which smell captures our memories. I frequently choose a different perfume for different prolonged chapters of my life and loathe parting with empty perfume bottles, since even one sniff of an old smell can instantly bring back the memories of a particular period. Yet, besides perfume bottles, there is little left for me to re-experience places, other than to physically return to them, hoping that they have not changed much and still smell the same.

Here is another complication to my reminiscence - on most of my current outings I am required to wear a mask. My mask is a regular dust mask, but its fabric has a very distinct smell. So my mask deprives me of the full range of “strange smells” on my rare grocery outings. Since it is my personal belief that masks will persist as a required outfit for a while (at least in New York City), that makes me crave strange odors even more. Ironically, the smell of my dust mask will be another one of the smells that capture a specific memory of a particular period of time.

As the whole world has shut its doors, with devastating consequences for the well-being of millions of people, each one of us is planning to take note of this transformative experience, promising to appreciate everyday life more. And while there is no certainty about when the streets of our cities will be crowded again, I am hopeful that this summer, in a crowded car of the New York subway on a hot August afternoon, I am going to be deeply inhaling the smells of strangers and smiling.

On book covers in public places

The year is 2019, and efficiency is all the rage - my contemporaries obsess over the optimization of their daily activities, trying to avoid burnout with forced quiet periods of meditation. It seems everything has become efficient - efficient learning with short YouTube tutorials, efficient cooking with pre-packaged delivered-to-your-door meals, efficient exercising with 5-minute workouts. “Efficient reading” takes the form of audiobooks to allow for multitasking and book summaries to expedite information consumption - more and more accomplished and digested for you, five times faster and with 10 times less effort. And yet, in the world of Kindles and iBooks there is something to be said for the exhibitionism of showing your reading preferences in public by carrying a good old-fashioned paperback.

A couple of days ago I was standing in line for the Philly-NYC train reading “The Captive Mind” by Czeslaw Milosz, when I heard someone addressing me: “It is not often that you see someone reading Czeslaw Milosz. Are you Polish?” I responded that one does not need to be of Polish descent to appreciate a winner of the Nobel Prize in Literature. It was a perfect conversation starter, and within minutes he was telling me about his family that moved from Poland and how this book had a deep impact on how he perceived post-war Europe from across the ocean. He told me about the trips he took to Europe, his nephews taking a summer trip to Eastern Europe, and his research as a PhD student in political sociology. We had to finish the conversation when they started boarding my train.

Another day I was walking around New York’s Soho with “Towards a New Architecture” by Le Corbusier, when a man stopped me and told me that he had read it during his freshman year. This sparked a spontaneous discussion about new architecture projects in New York, which somehow turned into exchanging tips for the upcoming design week.

Another book that was a huge success in igniting vivid conversations with strangers was “The Power Broker” by Robert Caro. People took pictures of me reading the book on the train, informed me of how long it took them to finish it (one person told me it took him a year, while a woman confessed that she just could not put it down until the last page), and wanted to know if I had read the author's newly published autobiography.

These recent incidents made me conclude, first, that people read, and read avidly (still, in spite of apps with book summaries). Furthermore, it appears that indulging in public reading will result in delightful conversations with strangers. Public reading of paper books seems to be a kind of virtuous exhibitionism - we bravely demonstrate our tastes and interests in a way that does not require us to transform our appearance and grooming.

The duty to read and finish “bad books”

You enter a bookstore. The front shelves are almost always occupied by “bestsellers” - the ones that literally everyone seems to be reading and talking about. Further in you encounter the “classics” - the ones that have withstood the test of time and are commonly agreed to be important contributions to humanity or history. Further away are the domain experts arranged on thematic shelves - the typical pundits of the subject matter. But if you venture deeper into the bookstore, or browse outside the “eye-level shelves”, you will find “the others” - the ones that never made the cut.

So you become adventurous and decide to give a chance to an unknown book, not previously reviewed by your friends or seen in Amazon's “recommended”. However, after a couple of pages (or chapters) you realize you are not enjoying it at all. Reasons can vary from distaste for the writing style (the language is too complicated/simplistic/deficient) to the subject (the topic is not covered/irrelevant/lukewarm/complicated/difficult to grasp). And so you apply the label “bad book”. Sometimes we don’t even go through the mental exercise of explaining to ourselves why we don’t like the book; we just toss it aside (with or without the intention to come back to it) and move on.

Both hedonistic and utilitarian bookworms would probably agree that this is the right decision, since reading is a pleasurable activity and the time should be spent on good books. I, nevertheless, want to proclaim that it is our duty to read and, most importantly, finish “bad books”. Let me start my argument with the conscious selection of bad books. Venturing outside of popular reading lists can be as advantageous as diverging from your favorite flavor of ice cream (“are you sure you like vanilla? how about raspberry-mint?”). It may seem like a very self-evident argument, but I am writing this in 2019 - in the time of the rule of content bubbles and recommendation algorithms that rarely let us break outside of similarity patterns. In the past, browsing in bookstores and libraries gave one a sense of serendipity: stumbling on an interesting cover, an unexpected title, a new section. Unfortunately, Amazon rarely gives you an overview of its complete database with a marker saying "You are here". Hence frisky reading choices are mandatory for leaving the realm of the self-referential knowledge loop.

An unexpected reward of reading "bad books" is a new perspective on the good ones. After making a detour into an irksome topic or genre, it is always a true delight to come back to your personal favorites. But even better, they feel more nutritious: you want to savor them longer, and you pay closer attention to the words used. Whether we return to old favorite authors or discover new ones, the process of reading books that correspond to our taste feels more acute, and it feels like the ideas penetrate deeper. Hence I cannot imagine enjoying my favorite written works without exposure to underwhelming and sometimes quite commonplace literature. In a way it sharpens your taste.

In case none of the points above convinced you, my last justification for finishing the book is the benefit of discipline and the responsibility of finishing things. By finishing the book we allow the other person to finish talking without interruption, regardless of whether the person is a sage or a fool. By finishing the book we demonstrate consideration and the ability to entertain ideas we do not share. In the end, the author did put in the time to convert their thoughts into sentences, so the least we can do is give a little bit of our time to contemplate their musings. While reading a "bad book" I like to conduct an imaginary dialogue with the author, questioning them, but I do think it's mandatory to let them finish the sentence...

Not every book we read should be a page-turner, but some that we come across can demonstrate the spectrum of variation in human writing. So be adventurous and pass by those bestsellers and classics, heading for the "un-reviewed" and the "un-starred". In the end you are around 50,000 words away from considering something you have not considered before.