Tech Interview: How Machine Learning & Data-Driven Development Help Drive Innovation

Data scientist Tomek explains how adopting the latest technology, from machine learning to automated analytics, helps drive the development of MyTherapy and improve key metrics such as user retention.

Dan Brown Jun 11, 2021

Dan is a journalism graduate from the UK. He aims to make the complex topic of digital healthcare accessible for both patients and experts, using simple and concise language.

Data scientist Tomek Plominski tells us how the team adopts technology such as machine learning and automated analytics to support the development of MyTherapy and gain insights that are crucial for the company and our partners. Read about these projects and how the team has the freedom to embrace the technology that drives the company forward.

I understand you’ve been working on projects involving machine learning. Can you explain a bit about what they involve?
So, one part that is already done is a machine learning model that anticipates when users are likely to stop using the app. We use data on individual user demographics, app usage patterns, and granular activity information to make predictions. This information can be used to help us boost retention rates, particularly among key groups of users.
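To illustrate the idea of predicting churn from usage data, here is a minimal sketch. The interview does not say which model type or features the team uses, so everything here is an assumption for illustration: the logistic regression, the two features (days since the app was last opened, sessions in the last week), the toy training rows, and all function names.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def features(days_since_last_open: float, sessions_last_week: float) -> list:
    """Scale two hypothetical usage signals to roughly the 0..1 range."""
    return [days_since_last_open / 30.0, sessions_last_week / 7.0]

def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression with plain batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # error = predicted churn probability minus the true label
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def churn_probability(x, weights, bias) -> float:
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Toy data, invented for this sketch: label 1 = user stopped using the app.
TRAIN_ROWS = [features(20, 0), features(15, 1), features(25, 0), features(10, 1),
              features(1, 7), features(0, 5), features(2, 6), features(3, 4)]
TRAIN_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

WEIGHTS, BIAS = train_logistic(TRAIN_ROWS, TRAIN_LABELS)
```

Calling `churn_probability(features(18, 0), WEIGHTS, BIAS)` gives a score for a user who has been inactive for 18 days. In practice such scores would be used to pick out the users who might benefit from extra engagement.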

The next step that we want to take is about pushing content to users. Of course, we want to make sure we don’t send content to users in the middle of the night, for example, but we can also use machine learning to figure out the best content to send to users at a given moment. We might want to send them information about a feature they haven’t discovered yet or something lifestyle-related that might be interesting for them based on their usage of the app. These things can have a big impact on the user experience.
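The "not in the middle of the night" rule can be sketched as a simple quiet-hours check that runs before any content selection. This is only an illustration: the 22:00 to 07:00 window, the function names, and the use of a fixed UTC offset for the user's time zone are assumptions, not details from the interview.

```python
from datetime import datetime, time, timedelta, timezone

# Assumed quiet window in the user's local time; it wraps around midnight.
QUIET_START = time(22, 0)
QUIET_END = time(7, 0)

def in_quiet_hours(send_at_utc: datetime, user_tz: timezone) -> bool:
    """Return True if the timezone-aware send time falls in the user's local night."""
    local = send_at_utc.astimezone(user_tz).time()
    return local >= QUIET_START or local < QUIET_END

def next_allowed_send(send_at_utc: datetime, user_tz: timezone) -> datetime:
    """Postpone a send that falls in quiet hours to the end of the window."""
    if not in_quiet_hours(send_at_utc, user_tz):
        return send_at_utc
    local = send_at_utc.astimezone(user_tz)
    target = local.replace(hour=QUIET_END.hour, minute=QUIET_END.minute,
                           second=0, microsecond=0)
    if local.time() >= QUIET_START:
        # Still before midnight locally, so the window ends the next morning.
        target += timedelta(days=1)
    return target.astimezone(timezone.utc)
```

For a user at UTC+2, a message scheduled for 23:30 UTC would be held until 05:00 UTC the next morning, which is 07:00 local time. A model that picks the best content would then only have to choose among sends that pass this check.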

Another thing we are doing is a package recognition feature, which is available for new users in Germany during onboarding. Users can scan the package and the app finds the best match from our drug database. It’s not a trained model, so it’s not actually machine learning, but it involves optimizing OCR (optical character recognition) and fuzzy text matching. It works reasonably well at the moment, but it is definitely something that could be developed further with image-based data processing and machine learning if we decide to go in that direction.
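The fuzzy text matching step can be sketched with Python's standard `difflib` module. This is not the team's implementation: the drug names, the `best_match` helper, and the similarity cutoff are hypothetical, and the OCR step itself is left out (the input string stands in for whatever the OCR engine returns).

```python
import difflib

# Hypothetical entries; the real drug database is not described in the interview.
DRUG_NAMES = ["Ibuprofen 400 mg", "Metformin 500 mg", "Ramipril 5 mg", "Pantoprazol 20 mg"]

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so OCR spacing noise matters less."""
    return " ".join(text.lower().split())

def best_match(ocr_text: str, candidates=DRUG_NAMES, cutoff: float = 0.6):
    """Return the candidate most similar to the OCR text, or None if nothing is close."""
    by_normalized = {normalize(name): name for name in candidates}
    hits = difflib.get_close_matches(
        normalize(ocr_text), list(by_normalized), n=1, cutoff=cutoff
    )
    return by_normalized[hits[0]] if hits else None
```

With typical OCR confusions such as "l" for "I" and "O" for "0", `best_match("lbuprofen 4O0 mg")` still resolves to "Ibuprofen 400 mg", while unrelated text falls below the cutoff and returns `None`.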

Do you want to take advantage of innovative tech to support the development of an app that helps millions of users live well with their disease?

Check out our vacancies.

I imagine a fair amount of time and effort has gone into these projects. What are some of the benefits?
A lot of focus is on boosting user retention. At the end of the day, the best way of doing that is to make the app more useful for people. If we can push content to people that is informative or lets them discover a feature they didn’t know about, that increases the value of MyTherapy to them. From a business perspective, user retention is even more important since we became a part of Shop Apotheke.

Demonstrating that we have excellent retention and engagement is also important for our partners, as it shows that MyTherapy can work very well as a platform for their patient programs. These are important issues for our partners, as the programs are usually for quite serious drugs. It’s also why we have put a lot of effort into automated analytics, to help show that the work we put into improving functionality does affect the way people use the app.

Can you tell us a bit more about automated analytics?
As I mentioned, one of the main reasons for tracking app usage is to deliver the key information to our partners. This is done through aggregate reports on a weekly or monthly basis, so we made the commitment to automate it as much as possible. We have provided these reports for a while, and the old method, which was based on connecting to database replicas and running scripts on a local server, was not optimal and had multiple issues. We decided to move to AWS, where data can be collected from different data sources and we have much more control over how we can use it.

So, now we can run automated reporting that just collects the information and prepares the documents for our partners. Aside from that, the AWS dashboard generates graphs and tables that are readily accessible, and we devised a fillable template to generate more in-depth custom-set data aggregates. So, if you want information about a particular group of users, or a certain disease or drug, for example, this template can help you access that information much more easily.
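As a rough picture of how a fillable template for custom aggregates might work, the sketch below treats the template as a set of optional filters applied to usage events before counting them. The `Event` fields, the `ReportTemplate` class, and the sample data are all invented for illustration; the interview does not describe the actual template or the AWS services behind it.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass(frozen=True)
class Event:
    user_group: str
    drug: str
    name: str

@dataclass(frozen=True)
class ReportTemplate:
    """Fields left as None are not filtered on."""
    user_group: Optional[str] = None
    drug: Optional[str] = None

    def matches(self, event: Event) -> bool:
        return (self.user_group in (None, event.user_group)
                and self.drug in (None, event.drug))

def run_report(events: Iterable[Event], template: ReportTemplate) -> Counter:
    """Count events by name for the slice of users the template selects."""
    return Counter(e.name for e in events if template.matches(e))

# Invented sample events standing in for real app analytics.
EVENTS = [
    Event("new_users", "Ramipril", "reminder_shown"),
    Event("new_users", "Ramipril", "intake_confirmed"),
    Event("long_term", "Metformin", "intake_confirmed"),
    Event("long_term", "Ramipril", "intake_confirmed"),
]
```

Filling in only `drug="Ramipril"` counts every event for that drug across all user groups, while adding `user_group` narrows the same report to one cohort.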

We’re talking about millions of events daily, so it’s not that easy to make sense of the data. These improvements make it easier for product owners, for example, to find the information they need. And it helps support us in data-driven development.

It’s much easier to see whether functionalities are working well, how people are using them, and what can be improved. It allows us to be a lot more confident in some of the decisions we make and takes out a lot of the guesswork.

You mentioned moving over to AWS. How have you found working on AWS rather than on local servers?
Even though we’re growing a lot at the moment, we’re still not that big of a company. So, working in AWS makes it much easier for us to function as a team without dedicated internal support from the backend or DevOps. We don’t have to worry about setting up servers to run our scripts; we can just use serverless solutions, and a lot of work that would otherwise need to be done manually is automated.

The other advantage is that we can use a big-data-oriented database that accepts all types of data, meaning we can collect data from different sources. We have several databases (one for each partner program, for example, plus one for accounts, one for production, and so on) as well as data with different structures, such as logs, app analytics, or Elasticsearch indices. We therefore chose to build a Data Lake in AWS; being able to collect it all in one place and use scalable big data solutions creates a lot more possibilities for reporting and allows us to run data exploration much, much faster than if we were still using disconnected database replicas.

Previously, when we wanted to look at user activity over a year or two years it took up to an hour to run a single query. Now it’s a matter of one or two minutes.

What you’ve spoken about reflects what a couple of other developers have said in these interviews: that there is an ambition at smartpatient to keep pushing and not just rest on our laurels. What are your impressions of the company mentality in this respect?
It’s something I find really cool because we have so much freedom in terms of research and what we can propose as a team. It can be the case that a team like ours concentrates on one or two solutions that are just improved continuously. But if you go beyond that approach there are many options and new technologies coming along all the time.

One of the reasons we’re growing as a team is to take advantage of more of these possibilities, which is really exciting.

How do you, as a team, decide where to focus? Although we’re growing, we don’t have endless resources, so how do you decide which areas to explore and implement?
Of course, we need to take a sensible approach and be confident that anything we decide to do has a positive impact. The main thing is communication and making sure we make decisions collectively. Regarding the automated reporting, for example, we realized that these piles of work coming in were only going to get bigger and more frequent, so we needed a way to decrease our workload on these tasks. We knew that it was something that could be done automatically, or almost automatically, so that’s what drove us to get it done.

Regarding the machine learning models and package recognition, that is something that is more closely related to the wider needs of the company. We know that boosting user retention is a key goal and there are a lot of different ways to approach it. From our perspective, we are trying to gain the maximum amount of value from the data we have access to. In the past, we mostly just used it to analyze bugs and tickets to help the developers identify which parts of the app were causing the problem or where some refactoring might be necessary. But we knew there were opportunities to do more, which is what led us to the projects I’ve discussed.

From your personal perspective, what have you gained from working on these projects?
For me, having some responsibility for choosing the technology we make use of to solve problems or improve how we do things is a big positive, and these particular projects have helped me develop my personal skills and knowledge. It is also satisfying to be able to shape the whole product in some way; I’ve worked in other, larger companies that had all this infrastructure in place and my responsibility was to focus on my small part. I didn’t need to worry about the whole environment. Here, it is interesting to be able to structure it, to find the right approach and the right environment that works best for us and is the most agile.

We have a lot of control over how we want to operate. There is no resistance when it comes to making things as they should be from our perspective and using the solutions that seem to be the best and most up to date.

We are hiring!
Are you interested in using the latest technology to help develop an app with millions of users around the world? Careers.