Tips for transitioning from Psychology to Data Science


Breaking into the amazing world of Data Science and Artificial Intelligence is becoming more and more difficult, especially for those with a non-technical background.

In this article, I would like to discuss the possibilities for Psychologists, or those with a socially-oriented background, to find a job as a Data Scientist.

As a Psychologist, you have several advantages over those with a purely technical background:

  • You have received in-depth communication training

  • You are a domain expert in your respective field (e.g., Economic Psychology or Clinical Psychology)

  • You are familiar with statistics, perhaps even more so than your technical counterparts

  • You have experience with small datasets

However, you might have an equal number of disadvantages:

  • You are unfamiliar with the computer science domain (e.g., creating production pipelines, unit testing, git, etc.)

  • You may lack the necessary math skills (e.g., calculus and linear algebra)

  • You have little to no experience with Data Science related algorithms (e.g., machine learning, NLP, process mining, information retrieval, etc.)

In this article, I will explain how you can use those advantages to your benefit and what things you can do to compensate for those disadvantages.

Hence, the main message in this article is simple:

Play to your strengths and improve upon your weaknesses.

NOTE: Many of the tips here generalize to other backgrounds. However, they are primarily aimed at those with a social background; I would have changed some tips for a different audience.

1. Learn a programming language

It might be a bit too obvious, but learning a programming language can be, especially in the long term, more important than you might think!

Which programming language should I choose?

This is highly debated and depends on the sector you would like to work in. In general, Python and R are currently the most widely used languages for Machine Learning and statistical applications.

If you are looking for a job that is not highly technical, but more analytical with the occasional prediction model, then I would highly recommend R. R has been used for data analysis in companies for longer, and there are still companies that have not yet switched to Python. The language excels at quick and relatively in-depth statistical analyses.

Moreover, as R was first used by statisticians, there is a good chance you have worked with it before as the Social Sciences are typically statistics heavy.

On the other hand, if you really want to focus on complex algorithms or on production pipelines, then I would recommend going for Python. Python is the go-to language for Data Scientists who want to put their AI models into production. It is highly flexible and, compared to R, has a wider range of use cases.

Although technically not a programming language, SQL is indispensable when it comes to accessing and analyzing data. It is typically used for querying information stored in a relational database. Even if you are a psychologist with a non-technical background, you can quickly learn how to do basic analyses with it.
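To give an impression of what such a basic analysis looks like, here is a minimal sketch that runs a SQL query from Python using the built-in sqlite3 module. The table name, column names, and the scores are all made up for illustration; a real company database will look different.

```python
import sqlite3


def mean_score_by_group(rows):
    """Return (group_name, mean_score, n) per group, computed in SQL."""
    # An in-memory database keeps the example self-contained.
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table holding one questionnaire score per participant.
        conn.execute(
            "CREATE TABLE responses ("
            "participant_id INTEGER, group_name TEXT, score REAL)"
        )
        conn.executemany("INSERT INTO responses VALUES (?, ?, ?)", rows)
        query = """
            SELECT group_name, AVG(score) AS mean_score, COUNT(*) AS n
            FROM responses
            GROUP BY group_name
            ORDER BY group_name
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


# Made-up data: (participant_id, group_name, score)
rows = [
    (1, "control", 4.0),
    (2, "control", 6.0),
    (3, "treatment", 7.0),
    (4, "treatment", 9.0),
]
result = mean_score_by_group(rows)
print(result)
```

The GROUP BY clause does the same job as computing descriptive statistics per condition in a statistics package, which is why SQL tends to feel familiar quite quickly.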

To what extent should I know how to program?

This depends on the type of Data Scientist you want to become. If you want to help the business make decisions, then I would advise you to understand the basics of SQL and R. However, if you are looking for an algorithm-heavy job where you put models into production, then it is key that you approach the knowledge and efficiency of a Software Engineer.

What other skills should I focus on?

Beyond a programming language, there are a few things that I would recommend learning in order to make your life easier:

Git is a version control system that helps you track changes in your code. I have seen many data scientists create copies of their notebooks/files and call them V2 in order to add features to their solutions. This is not only inefficient, but also makes it difficult to properly version your application and track down inconsistencies.

Use a proper IDE when creating your data-driven solutions. For example, using PyCharm instead of Jupyter notebooks will help you write better code, as it offers many features that help you spot problems.

If you want to take things a step further, you could look at the following:

  • Unit testing

  • Analyzing running times

  • API development

  • Docker integration
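As a small illustration of the first two items, the sketch below tests a function with Python's built-in unittest module and measures its running time with timeit. The z_scores function and the test data are my own hypothetical example, not something from a specific project.

```python
import statistics
import timeit
import unittest


def z_scores(values):
    """Standardize values using the mean and the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mean) / sd for v in values]


class ZScoreTests(unittest.TestCase):
    def test_known_values(self):
        # For 1, 2, 3 the mean is 2 and the sample standard deviation is 1.
        for got, expected in zip(z_scores([1.0, 2.0, 3.0]), [-1.0, 0.0, 1.0]):
            self.assertAlmostEqual(got, expected)

    def test_constant_input_raises(self):
        with self.assertRaises(ValueError):
            z_scores([5.0, 5.0, 5.0])


# Run the tests without exiting the interpreter afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ZScoreTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)

# Time 100 calls on a list of 1,000 numbers.
data = [float(i) for i in range(1000)]
elapsed = timeit.timeit(lambda: z_scores(data), number=100)
print(f"100 calls took {elapsed:.4f} seconds")
```

In a real project the tests would usually live in a separate file and be run from the command line, but the idea is the same: state what the function should return, then let the computer check it every time the code changes.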

2. Get experience

There are many ways in which you can get experience in this field. The ones below are those that I felt would be the greatest benefit in making the transition.

Internship

Having one or more internships on your resume is, arguably, the most important factor in landing a job as a Data Scientist. In my experience, employers are looking for employees who have seen the messy world of business data, as opposed to the relatively pristine data you see in academia.

An internship will also help you understand the language that is spoken within the field of Data Science. People rely heavily on heuristics and biases in their decision making, so when you talk like a Data Scientist, they are more inclined to consider you one.

Moreover, use your statistical skills to your benefit. Many start-ups and smaller organizations would love to have somebody in their team who can analyze their small datasets and at the same time clearly communicate those results.

Create a portfolio

A portfolio can help you communicate a wide range of skills and projects that might be relevant to a potential employer. Not only that, but it can also be used to learn how to properly explain technical principles to people who have little understanding of the field. An important skill to have!

I would suggest you have one of two things in your portfolio:

  • Either several projects within a single specialization (e.g., deep learning)

  • Or several projects across a wide range of specializations to demonstrate a wide range of abilities

3. Use your background to your benefit

After I switched from Psychologist to Data Scientist, I wanted to be recognized for my technical skills. I had worked so hard to get the skills necessary to call myself a Data Scientist. Employers would often say that my Psychology background would be helpful when translating Data Science / AI solutions to non-technical stakeholders. However, I wanted to work on those solutions myself! There were even times when I left my Psychology background off my resume so that I would be recognized only as a Data Scientist.

In hindsight, this obviously was not the right thing to do. What did help was actually simple:

Focus on a field where knowledge about Psychology is mainly seen as domain-knowledge, not as the ability to communicate technical matters well.

For me, this resulted in a position as a Data Scientist where I would focus on analyzing and predicting human behavior.

NOTE: If you enjoy being the bridge between Data Scientists and Stakeholders, then your social background combined with basic Data Science knowledge should be sufficient for pursuing such a role.

4. Learn BI tools

Using BI tools might not be the first thing that comes to mind when you think about Data Science solutions. In practice, learning those tools is more important than you might realize.

If you work in a non-research environment, then there is a good chance that your solutions are going to be used by non-technical stakeholders. Those stakeholders typically already make use of BI tools to drive their decision-making processes.

In order for your prediction model to be used by those stakeholders, it is best to integrate it into their existing workflow. Knowing which BI tools they use helps you integrate the output of your model into that workflow.
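One simple way to hand over model output is a plain CSV file, a format that BI tools can generally load as a data source. The sketch below writes predictions with Python's built-in csv module. The column names, the values, and the file name are hypothetical, and in practice a shared database table is often a better choice than a file.

```python
import csv
import os
import tempfile


def write_predictions(rows, path):
    """Write one row per prediction to a CSV file with a header line."""
    fieldnames = ["customer_id", "churn_probability"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Made-up model output for two customers.
rows = [
    {"customer_id": 101, "churn_probability": 0.82},
    {"customer_id": 102, "churn_probability": 0.07},
]

# A temporary directory keeps the example from cluttering your project.
out_path = os.path.join(tempfile.mkdtemp(), "predictions.csv")
write_predictions(rows, out_path)

# Read the file back to check what a BI tool would see.
with open(out_path, newline="", encoding="utf-8") as f:
    loaded = list(csv.DictReader(f))
print(loaded)
```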

A few of the most popular tools are Qlik, Tableau, and Power BI. I would suggest making at least one dashboard in each of them to understand their basic workings. Then, choose your preferred tool and dive a bit deeper to understand its data architecture.

5. Educate yourself

Having a socially-oriented background is both an advantage and a disadvantage: employers are more inclined to think your skills are best suited for a communication-heavy position. To have them acknowledge your technical skills, a formal credential, whether from a Master’s program or online courses, might be of benefit.

Master’s degree

Pursuing a Data Science Master’s degree after having completed a socially-oriented program can be quite difficult. Compared to those with a technical background there is a good chance you lack the necessary technical skills, such as programming, linear algebra, calculus, data structures, etc.

Thus, it is important to look for programs that will help you get those necessary skills in an accelerated time-frame. Some programs have a nice balance between technical and social courses that would suit someone with a social background.

I would advise looking at either a research Master’s program or a Data Science Master’s program. The former is often within your field and allows you to combine it with advanced analysis skills, especially if you focus on predictive modeling in your Master’s thesis. The latter will introduce you to common Data Science algorithms and methods while maintaining a nice balance between Data Science and Business courses.

Although many advise pursuing a Computer Science or math-heavy program, I feel that it would be too much of a change in direction for a Psychologist.

MOOC

For many, it is difficult to enroll in a new Master’s program because of financial constraints or because they are already working full-time. A great solution is to follow MOOCs (Massive Open Online Courses). In other words, courses you can follow online.

What makes MOOCs so perfect for learning a new profession is that you can work on those courses in your spare time whenever it suits you. They are often even cheaper than a regular Master’s degree!

The problem with these courses is that some research is needed to identify which ones are worth your time. Some are known to be great, such as the Machine Learning and Deep Learning courses on Coursera by Andrew Ng.

My personal preferences for platforms are:

NOTE: The courses on Udemy are often on sale which could save you hundreds of euros/dollars!

For me, making that transition took several years of high effort. It is a difficult journey, but believe me, it is definitely worth it!