What’s in a data analyst? Survey suggests Python user base mainly devs

Last November, the Python Software Foundation and dev tool creator JetBrains took the pulse of the Python community for the third time. The results are out now, suggesting that Python users have gotten more into containers and mostly define themselves as developers, no matter how much data science they do.

The Python Developers Survey 2019 was apparently answered by more than 24,000 Python users from over 150 countries and gives insight into what the language is used for and where. The “what” part has been relatively stable over the last three years, meaning that most respondents still say they use Python for data analysis purposes (58 per cent), followed by web development at 49 per cent. Other frequently cited project areas include DevOps/sysadmin/automation and machine learning (39 per cent each).

However, looking at the job titles used by those taking the survey, only 19 per cent would describe themselves as data analysts. The vast majority (73 per cent) find the developer or programmer job title a better fit.

With the last release of the Python 2.x series just out and support canned, it’s good to see the number of Python 2 users has dropped another 6 per cent in 2019, leaving only 10 per cent of respondents actively using it. Most of those can apparently be found in the fields of web development (45 per cent) or DevOps/sysadmin (41 per cent), which could be down to web development being a more mature field with more legacy code in place than, say, machine learning, which is currently evolving more quickly.

Speaking of data-heavy use cases: asked which tooling they used for their data science tasks, respondents made numerical computation library NumPy the clear front-runner with 63 per cent, followed by Pandas (55 per cent), matplotlib (46 per cent), SciPy (36 per cent), scikit-learn (33 per cent), TensorFlow (26 per cent, PyTorch meanwhile came in at 15), and Keras (20 per cent). Apache Kafka seems to be the most popular big data tool (13 per cent) in the Python space, while Flask takes that title for web frameworks (used by 48 per cent), ahead of Django (44 per cent).

In the cloud, Python devs aren’t too different from everyone else. 55 per cent cite AWS as their cloud platform of choice, followed by GCP (33 per cent). Interestingly, although their percentages are said to have dropped in 2019, DigitalOcean (22 per cent) and Heroku (20 per cent) still come in ahead of Azure (19 per cent) and PythonAnywhere (12 per cent). But then again, almost two thirds use Linux in some capacity, so maybe it isn’t that surprising that the Microsoft cloud doesn’t get more interest.

Containers are still gaining ground amongst Python developers, with 47 per cent (40 in 2018) now using them to run code in the cloud, meaning the approach has overtaken virtual machines (46 per cent). Development for the cloud is still mostly done locally with virtualenv (56 per cent), though the use of Docker containers jumped from 35 per cent to 41, with virtual machines (22 per cent), local system interpreters (18 per cent) and remote development environments (17 per cent) further off.

As in the 2018 edition, the large majority of respondents claimed to use Python as their main language (84 per cent), with 58 per cent using it for personal as well as work tasks and the rest split equally between work only and personal projects only.

67 per cent of participants said they were fully employed by a company or organisation, followed by students (10 per cent), working students and self-employed workers (6 per cent each), and freelancers (5 per cent). A bit more than half of the people asked (53 per cent) work in a team, which in 75 per cent of cases is a workable unit of two to seven people.

According to the survey, employers of Python users can mostly be classified as IT or software development companies (42 per cent), which above all target their own industry (45 per cent; other target industries such as finance or sales only start to trickle in at 4 per cent and below).