Introduction

  • "We must do this!"
  • "Ok. Calm down. Let's first talk about it."

I have to be honest: I struggle when someone comes to me with an idea for a data science project. Most of the time, the person has recycled a concept they read in a Forbes or even a Business Insider article and adapted it to their needs, ignoring how different the companies' contexts are.

The result is a Frankenstein-like project.

This kind of project can be super exciting and even technically sound, but it is often unfounded and destined to fail, and the business won't see any real, positive impact. Why? Because there is no real data strategy supporting the decision to develop those projects.

What is data strategy

Data strategy is a plan of action that will show you how, when, and where to apply data to your business. It's holistic, in the sense that it considers the whole organizational context: people, culture, means, and yes, technology too. It should be actionable for your specific organization and also flexible enough to adapt to evolving contexts.

Its objective is to deliver a massive positive impact through data, now and for the years to come.

For you, data strategy might be something different. Make your own definition and let it evolve.

I've split my definition of data strategy into five pillars, which together spell FORCE:

  1. Foundation;
  2. Observation;
  3. Resilience;
  4. Competence;
  5. Expansion.

I'll go into detail about each of these throughout the book.

Why do companies need data strategy

Failed projects make up the majority of all Data Science and Artificial Intelligence (DSAI) initiatives.

Many of these projects are technically functional but fail to meet expectations.

These are some of the typical reasons behind the failures of those individual projects:

  • Missing or dirty data;
  • Unavailable or uninterested stakeholders;
  • Inability to deploy at scale;
  • Inability to reproduce previous results;
  • etc.

There are also killer reasons at the organizational level, for example:

  • Hard-to-reach information isolated in silos;
  • Each department has its own source of truth for the same data;
  • No standard quality control in many of the data sources - particularly in less technical departments;
  • Repeated/duplicated work done by multiple teams or departments to solve exactly the same problems;
  • etc.

Both of these lists are incomplete, but for me, the real reason is apparent: these companies lack a comprehensive data strategy.

To deliver DSAI projects successfully and consistently, you need something like the FORCE framework explained in this book.

Without Foundation, your projects will either be ignored or simply get stuck (yes, that's the technical name). Without Observation, you won't be able to understand what's happening in your company; you won't understand where data is needed or what work should be automated. Without Resilience, your systems will fail catastrophically under the slightest stress, right when you most need them. Without Competence, you'll be optimizing the wrong processes and will be incapable of noticing any impact. Without Expansion, you'll be going in blind, developing new projects and services that no one actually wants.

A data strategy prevents this. How it does so is what you'll learn in this book.

Why another book on data strategy

Feeling powerless is probably one of the most common experiences for managers applying data science and AI to their businesses. Many of them feel lost and confused by all the available information. This information is often overcomplicated, and what is supposed to give them clarity instead leaves them with an overwhelming sense of disorientation.

I've yet to see a practical, lightweight data strategy that most companies can apply described in a book or a university course. Most of the work done on the topic is either theoretical or targeted at big corporations and governments. This makes it hard to implement in small and medium-sized enterprises (SMEs), and their attempts fail in most cases. I am not a university professor, and I don't have the support of a large, 100-year-old consulting company. But I have enough experience to say that these two types of data strategy do not fulfill the needs of 99% of companies.

SMEs represent the vast majority of companies around the world. They make up 99% of all businesses in the European Union, 98% in the Asia-Pacific region, and 99.9% in the US. These companies can't afford to waste money on failed experiments in the hope of converting them into highly rewarding marketing gimmicks.

Small companies can't play around like this. Those are the companies I want to help. To maximize their chances of succeeding, I wrote this book in the most practical and clear way I could. I tried to stay away from unnecessary fancy words and to write the book so that anyone could read it, even without a science degree. After reading this book, you will understand how to apply data and get value from it. If you don't, I've failed to deliver the value I wanted from this book.

I want to bring data science to that 99% of companies. I want it to be available to everyone, which is also why I'm publishing the book online and for free until all the chapters are out. Does this make it better than others? Definitely not, but I hope it fills a necessary space in helping the ones who need it most.

Who should drive data strategy

I'm going to be short and blunt about this.

Delegating your data strategy to your CIO (Chief Information Officer), CDO (Chief Data Officer), or IT department is a recipe for disaster. Data is not just an IT asset. In any data-driven company, data cuts across all departments, and so do the impacts of its application. It should be respected as such by the people directing the company.

As a CEO or COO, you cannot expect your technical people to do all the work of identifying needs and creating new data initiatives by themselves. The probability of these initiatives being complete is low.

Besides, if you are serious about data, it's your responsibility to get involved in the process and to make sure the whole company understands how much you support it. If the process is delegated to an IT or data science department, other departments will likely disregard it or even reject it outright. Data strategy needs to be informed by operations and strategy and supported by data.

Who am I

I've committed most of my life to science and technology. Since I was a small kid, I knew I wanted to be a scientist.

I started programming when I was twelve. The computer we had at home didn't have internet and wasn't fast enough to run games either. I was also not interested in team sports and didn't have many friends. These obstacles were necessary to direct my focus and attention to what was considered a very weird hobby for a 12-year-old back in 2002.

Everything was self-taught, mostly from reverse-engineering scripts I found online and from trial and error. Since then, programming has grown to be my most important leverage in both personal and professional projects.

I've been a data scientist for eight years now. When I started, not many companies were aware of the potential of data science, and I was fortunate not only to witness the expansion that followed but also to be part of it. I've committed my professional career to helping organizations use data more effectively and to developing techniques and strategies that make it easier to collect, extract, process, model, and communicate everything related to it.

In most other careers, eight years is nothing. My experience, however, is far from irrelevant.

As a researcher for UC and MIT, I got to work with market leaders in Portugal and Singapore, and with air quality data from MIT's campus in Cambridge, Massachusetts.

At Entrepreneur First, in London, I worked with data from social media and transportation and applied it to the outdoor advertising industry.

As a consultant, while still in London, I helped companies like Kheiron Medical, Hire Space, and ReachX unleash the power of data.

As a mentor, I helped tens of aspiring data scientists start their careers and grow their projects.

Today, I'm the co-founder and CEO of Enlightenment.AI. EAI helps companies of all sizes and in all industries understand how to capitalize on their data.

Naturally, there was a progression to where we are today. I started as a technical guy; however, as I applied data science to businesses, I began to develop a new kind of interest. This interest was unfamiliar to me, and it felt almost unhealthy as I grew more and more curious about business, management, efficiency, and effectiveness. Learning about these topics changed me and the way I saw data science. Over time, clients no longer wanted me only for my technical skills. Directors and C-level executives were now asking for my advice on how they could strategically leverage their data.

Why should you read this book

Efficiency and productivity are probably the keywords that best define all the industrial revolutions we've seen so far. These words probably also sit behind your motivation to learn more about what it means to be data-driven and how this, together with automation, will change the future. In this book, you will learn how to apply these concepts to a business to maximize your initiatives' success.

During the first industrial revolution, hand production methods, particularly the repetitive ones, were automated. With digitalization, the same is happening to many processes that were once done by humans. But we're doing it so, so badly. Why do I say this? Let me give you a few examples:

  • The marketing specialist who has to manually log into multiple social networks and platforms to send the same message to their followers.

  • The account manager who manually inputs each order for the hundreds of clients they are managing.

  • The admin worker who logs into the company's bank website to get the transaction and balance details for a monthly report.

  • The finance department worker who still stores the company's invoices on paper and inputs their values into the computer manually.

These are widespread and extremely inefficient practices. Humans should not be doing this. All these processes must be transparent and automated.

Nowadays, many people are becoming more interested in data science and AI, and it's common to see people wanting to skip steps. I believe a company should not apply data strategy and AI until its basic, fundamental processes are truly digital. Let me explain.

For a lot of companies, digitalization means moving their paper forms to digital forms. This facilitates storage, transportation, and querying of that data. That is, however, an underutilization of technology and not much of a revolution. Often, the information lives in digital systems, but these systems do not communicate with each other. Companies then hire someone for the sole purpose of manually inputting and transferring data from one system to another. This happens not only in small companies but also in large international enterprises, perhaps even more so because of their large variety of systems.

What we have right now is not digital. It's papers on a digital screen.

Of course, besides automation, being data-driven is now more relevant than ever. Your decisions will have long-term impacts, and ensuring you're making the right ones is at the core of every successful company. However, most managers can't even keep track of what is going on in their businesses. The data is not collected, has no integrity, or is simply too difficult to extract, move, and communicate. I will talk about this in the next few chapters.

With AI, companies that are already integrating information from multiple sources will be able to start automating some of the decision-making. Sometimes, AI can replace a knowledge worker, at least for some of their decisions. This replacement is already happening today, and I will give examples along the way.

The industrial revolution forced not one but all industries to adapt. The ones that resisted or delayed that change for too long didn't survive. You might resist digitalization, automation, and data-driven decision-making, but please know that you will most likely have to change the way you do business.

Instead, imagine embracing current innovation. Imagine a company that applies technology beautifully. A company where people barely have to sit in front of their computers to do mundane, repetitive tasks. Instead, technology is transparent, working in the background and freeing people up from the jobs they hate. People can now focus on their most human work: creative problem solving, building relationships, creating something new.

This is the new way of doing things. Companies will need to adapt, and adapt fast. Those that do will see compounding returns over the years to come, leaving all others behind.

Book structure

The book is structured in two parts:

  1. FORCE
  2. Business-friendly technical concepts

In the first part, I'll describe FORCE and what it can do for your business.

In the second part, I'll present the most important technical concepts in a business-friendly way that anyone can understand. I won't use formulas or very scientific names, except when absolutely necessary. In those cases, I'll make sure to explain them in simple words as well.

Rules

I wrote a few rules for me, the author, and for you, the reader. I know what you are thinking: "Rules? For a book?" Yes. I wrote them to make sure I do not deviate from my objectives with this book and also so you can get the most value possible out of it.

Rules for me, the author

  • I will not use unnecessarily complicated words or unnecessary jargon;
  • I will keep a conversational tone throughout the whole book;
  • I will respectfully simplify concepts to the best of my knowledge;
  • I will not use formulas;
  • I will keep the book as nontechnical as possible;
  • I will focus on the impact as much as possible;
  • I will focus on business as much as possible.

Rules for you, the reader

  • I will understand that this is just a framework: every organization is different, and this is not a set of rigid rules I need to follow;
  • When applying these guidelines, I will tailor them to my organization;
  • I will reserve focus time for this book, take notes, and take time to think whenever needed;
  • I will contact the author if there's something I disagree with or that I think is missing (yes, this one is optional, but very much appreciated!).

Start today

If you feel this text resonates with you, today is the best day to start reading and applying the information contained in this book.

If you are interested in strategic data science and AI, join our data strategy group on Telegram and feel free to ask me or the community any questions you might have. Tag me if you want to share your stories or if there is anything you would like to add.

This book is and will always be under revision and your participation is much appreciated.

Thank you!