Data Science
Use Data Science to unlock the potential of your data
Our team of Data Scientists supports companies not only in their strategic Data Science projects, but also more operationally as an external Data Lab at the service of specific business issues, using Data to generate a POC, a model or a tool.
The challenges related to data sciences
Data science covers several disciplines whose shared objective is to bring out trends, patterns, connections, correlations and predictions from data. The immense possibilities offered by Data science should not overshadow the challenges associated with it. These challenges include:
The promises of Data are great, especially in health and for the healthcare industries: discovery of new therapeutic avenues, acceleration and greater fairness of clinical trials, personalized medicine and optimized patient follow-up, etc.
To help achieve these ambitious goals, Data science needs accessible, high-quality data and, above all, needs to interface upstream with the owners of the data and downstream with the users of the conclusions drawn. Extracting value from data therefore requires companies to move towards a more data-driven model in order to make the most of the work of Data Scientists.
What opportunities can Data science services generate for my business? Are we equipped and structured today to create this value?
Data cleaning, a crucial step in any Data science process, determines the technical success of the analysis as well as the value of its interpretation. It may require making choices to increase the signal-to-noise ratio. In particular, Natural Language Processing (NLP), the field of Data science focused on the analysis of textual data, may require particularly extensive data cleaning depending on the source used. For example, information collected on social networks often contains misspelled words and abbreviations that must be detected and interpreted.
What data is rich enough for its analysis to add value? How to extract value from the internal or external databases that we have?
The quality and representativeness of the input data are key to drawing relevant conclusions. In particular, poorly balanced (“imbalanced”) data can bias learning. If we train an algorithm to classify images of cats and dogs on the basis of 1000 images of cats and 100 of dogs, the higher frequency of cats in the training set will influence how new images are classified.
This imbalance can be easy to identify if it concerns the main detection objective but much less so if it is one element among others, for example an over-representation of kittens among the images. Historical databases may be biased, such as clinical trial databases in which Caucasian men are over-represented relative to their share of the general population. The aim is to identify these biases and correct them by reducing the size of the over-represented sample (undersampling) or by artificially increasing that of the under-represented sample (oversampling).
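As an illustration of undersampling and oversampling, here is a minimal sketch using only the Python standard library. It assumes the simplest variant of each technique (random duplication and random removal); the function names and the file names such as `cat_0.jpg` are hypothetical placeholders, not part of any real project or library.

```python
import random
from collections import Counter


def _group_by_label(samples, labels):
    """Group samples into a dict mapping each label to its list of samples."""
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(label, []).append(sample)
    return groups


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until every
    class is as large as the largest one."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = max(len(group) for group in groups.values())
    pairs = []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        pairs.extend((sample, label) for sample in group + extra)
    return [s for s, _ in pairs], [l for _, l in pairs]


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples of the larger classes until every class is as
    small as the smallest one."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = min(len(group) for group in groups.values())
    pairs = []
    for label, group in groups.items():
        pairs.extend((sample, label) for sample in rng.sample(group, target))
    return [s for s, _ in pairs], [l for _, l in pairs]


# Hypothetical dataset mirroring the 1000 cats / 100 dogs example.
images = [f"cat_{i}.jpg" for i in range(1000)] + [f"dog_{i}.jpg" for i in range(100)]
labels = ["cat"] * 1000 + ["dog"] * 100

_, over_labels = random_oversample(images, labels)
_, under_labels = random_undersample(images, labels)
print(Counter(over_labels))
print(Counter(under_labels))
```

After oversampling, both classes contain 1000 images; after undersampling, both contain 100. Note that this only balances the label being predicted: a hidden imbalance, such as too many kittens among the cats, would require grouping on that attribute instead.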
For example, how can clinical centers be better targeted to achieve good representativeness of the population included?
In the implementation of machine learning models, another technical issue is avoiding over-adapting the model to the existing dataset, a phenomenon known as overfitting. The quality of a model is tested with various indicators that account for the reliability of its predictions, such as accuracy (the share of correct predictions), sensitivity (the capacity to correctly detect true positives) and specificity (the capacity to correctly identify true negatives).
Trying to maximize these indicators can lead to including many variables in the analysis or to using increasingly complex models. It is important to set aside a sample of your dataset that is used not to train the model but to test it. Because training data is often more homogeneous than real data, it is also important to limit the complexity of the chosen machine learning model to the minimum required. Combining the results of several models is another technique for limiting the biases inherent in each individual model.
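The three indicators named above can be computed from the counts of true and false positives and negatives. The sketch below is only an illustration for a binary classifier: the function name `binary_metrics` and the example labels are made up for the purpose, with 1 standing for the positive class and 0 for the negative class.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity and specificity for binary labels (1/0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / len(pairs),
        # Share of actual positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of actual negatives that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up example: 4 actual positives and 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this example the model finds 3 of the 4 positives (sensitivity 0.75) and correctly rejects 4 of the 6 negatives (specificity of about 0.67), for an overall accuracy of 0.7. Looking at the three figures together matters on imbalanced data, where accuracy alone can look good while one class is poorly detected.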
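Two of the safeguards mentioned here, keeping a test sample aside and combining several models, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the helper names `train_test_split` and `majority_vote` are defined locally for the example (they are not calls to an existing library), and the three lists of model predictions are invented.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction that is never used for training."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def majority_vote(predictions_per_model):
    """Combine several models by keeping, for each sample, the most frequent
    prediction across models (one inner list per model, aligned by sample)."""
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hold out 20% of a placeholder dataset of 100 rows for testing.
train_rows, test_rows = train_test_split(list(range(100)))
print(len(train_rows), len(test_rows))

# Invented predictions from three hypothetical models on four samples.
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]
ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

The held-out rows give an estimate of how the model behaves on data it has not seen, which is where overfitting shows up. Majority voting is only one simple way of combining models; using an odd number of models avoids ties on binary labels.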
How to adapt a forecast model to anticipate scenarios with events that have never happened in the past?
Finally, the interpretability of model results is a crucial issue. Some of the most powerful machine learning models, such as deep learning models, do not make it possible to trace back to the parameters that led the machine to propose, for example, a given classification. It can sometimes be better to accept a less accurate model in exchange for being able to explain it. For example, if you want to create a customer segmentation, it is useful to know which parameters define the segments in order to then create appropriate interactions and content.
Rigor in the interpretation of the data is also essential and must be conveyed to the recipients of the results. In particular, it is often tempting to interpret a correlation between two variables as a causal link from one to the other, a conclusion that usually needs to be supported by business knowledge or by specific studies in addition to the analyses already conducted.
How to make the results of Data science analyses usable and understandable for internal distribution?
How we support you in your projects related to data sciences
For over 25 years, Alcimed has been supporting its clients in their innovation and new business development projects.
With this strong business experience and competences in Data science consulting and a dedicated team, Alcimed is positioned as an external Data Lab for companies, serving your business issues, and aiming via Data projects to generate a POC, a model or a tool. This consulting approach can be just as much part of a project as a market study or a full mission.
The data used can be your internal data or external data, whether open data, private data or data obtained by web scraping. It can be numerical data, text, images, etc. We draw on the full range of our Data Scientists’ tools to carry out these projects.
Beyond these concrete achievements and solutions, our consulting services can also be part of a broader strategic framework: implementation of a Data-driven strategy and culture in your company, creation of a data-driven innovation process, etc.
What they say
"There are three points I particularly appreciated when working with your consultants: the relevance of the teams, the structuring of the data and the commitment of the teams. Alcimed never gives up!"
Philippe Caillat
Marketing Director
"Together between Alcimed and Nestlé Health Science, we made informed decision on where the best opportunities were and why and how to select the most valuable assets in our investigation."
Bernard Cuenoud
Global Head of Research and Clinical Development
Examples of recent projects carried out for our clients in data sciences
Creation of a customer engagement measurement index for a pharmaceutical manufacturer
Alcimed supported a leader in the pharmaceutical industry in the construction of an aggregate indicator for measuring customer engagement.
Using its client’s internal databases, the Alcimed team devised an aggregation method to take into account the impact of all the company’s interactions with healthcare professionals.
The indicator makes it possible to measure the evolution of customer engagement at the individual level or by customer profile over time to better measure the impact of events and marketing campaigns of the company, for example.
Detection of weak signals in a medical communication anticipation logic, for a pharma player
Alcimed supported an international pharmaceutical player in the definition, design and implementation of a tool for visualizing weak medical signals (doctor concerns), allowing our client to anticipate medical communication issues.
Our team implemented NLP techniques and advanced statistical analysis of textual elements allowing automatic detection of signals and their escalation to specific product teams. We also supported our client in the deployment of this new approach internally.
Prediction of the number of building permits for an industrial leader in the construction industry
In order to support our client, an industrial leader in the construction industry, in anticipating its volume of business, Alcimed developed a machine learning algorithm to predict, before they are all officially referenced by local administrations, the total number of building permits actually filed in the current month based on historical public data.
This consulting project thus enabled our client to anticipate its sales projections and adapt many of the company’s upstream activities accordingly.
Identification of the dissemination of key themes in networks of Digital Opinion Leaders
Alcimed’s team of Data scientists has set up a machine learning model to conduct an unsupervised analysis of the topics mentioned on Twitter in connection with the American Congress on Diabetes, ADA2021, as well as Twitter user communities communicating on this topic.
Our analysis, published in our Data use case 2, uses a visualization of networked data to highlight two communities that are mainly concerned with different topics.
Comparison of the footprint of different pharmaceutical players in European professional organizations
Alcimed worked with a pharmaceutical industry player to understand their footprint among healthcare professional associations in Europe.
In 15 of our client’s key markets, we aggregated and consolidated information available from public sources (association websites, LinkedIn, press releases) into a database, identifying the main professional associations, the working groups attached to them, and the elected members of these associations.
This exploration enabled our customer to gain a clear view of the existing associations in its key markets, its own and other players’ footprints in terms of representation within these associations, and finally to have an action plan concerning the reinforcement of their current position in certain associations or the interest of entering new ones.
Identification of trends in the dietary supplements market for a food industry player
Alcimed helped a player in the food industry understand the needs and expectations of ingredients from the point of view of its customers and end consumers in the dietary supplements market.
Based on a database of over 1,000 dietary supplement launches over the last 5 years, describing ingredient lists and product forms, we structured a comprehensive dashboard presenting, in particular, emerging trends in new ingredients, their most common combinations, and new galenic forms.
Thanks to this dashboard, our customer now has a dynamic support system consolidating all the knowledge acquired through recent product launches, as well as the key lessons needed to work on the expansion strategy for its ingredients offering in the dietary supplements market.
To go further
Healthcare
Computer Vision in Healthcare: the applications and challenges of this new AI solution
Computer Vision is making its way into the healthcare sector, with a variety of medical AI solutions. But how is it being used, and what are the challenges ahead?
Cross-sector
Data use case #1: Decipher data to rethink the customer engagement model in dermatology
How can a better understanding of the evolution of medical demographics and healthcare demand allow pharmaceutical companies to adapt their customer engagement models?
Cross-sector
Data Use Case #2: Deciphering the mechanisms of online information sharing in a therapeutic area
On social networks, who are the influential actors in a given pathology on a given theme? How is the online community organized on these topics? Discover how to analyze online data to answer more ...
Founded in 1993, Alcimed is an innovation and new business consulting firm, specializing in innovation driven sectors: life sciences (healthcare, biotech, agrifood), energy, environment, mobility, chemicals, materials, cosmetics, aeronautics, space and defence.
Our purpose? Helping both private and public decision-makers explore and develop their uncharted territories: new technologies, new offers, new geographies, possible futures, and new ways to innovate.
Located across eight offices around the world (France, Europe, Singapore and the United States), our team is made up of 220 highly-qualified, multicultural and passionate explorers, with a blended science/technology and business culture.
Our dream? To build a team of 1,000 explorers, to design tomorrow’s world hand in hand with our clients.
Data science is a fairly broad field that aims to give meaning to raw data.
To create this meaning, Data science covers several disciplines whose objective is to bring out trends, patterns, connections, correlations and predictions from data. To do this, Data science uses a wide variety of tools and techniques, ranging from algorithm development, applied mathematics and advanced statistics to artificial intelligence, to produce different types of models. These models can be deterministic or learning-based, the latter relying on machine learning, which analyzes and predicts data in a supervised or unsupervised way.
Data science is a specific field in the world of data, and the Data Scientist profile differs from those of the Data Analyst, Data Engineer, etc. Data science processes and services require accessible data; in large organizations, this is made possible by Data Architects or Data Engineers, who structure systems and databases. Making this data accessible is often the essential first step in a Data science consulting project.
There are different standard objectives according to Data science approaches:
- Analyzing the links between variables and searching for recurring patterns and statistical anomalies makes it possible to find associations and correlations and to identify the strongest ones, but also to group and segment the data, for example to identify sub-populations in study groups or to create customer behavior personas.
- Regression and classification make it possible to predict over time, or to estimate beyond the available data, either the value of a variable (such as the number of hospitalizations linked to a pathology) or the category to which a new data point belongs (for example, predicting the acceptance of a vaccine according to the patient profile, or the probability of a certain diagnosis based on medical and radiological data).
The difference between the work of Data Analysts and Data Scientists lies mainly in the latter’s use of “Big data” and creation of complex models to carry out their analyses.
This difference can be summed up by 5 major “V” concepts:
- Volume and Velocity: data is obtained in large numbers and is accumulated by the company at such a speed that it cannot be used. For example, a large number of performance indicators, collected during a marketing campaign, are reported in templates which should allow business experts to learn from them. However, without the help of Data science, this data is too diffuse and raw to draw lessons on the next action to be taken.
- Variety: unlike business analysts who will often be able to use reports produced annually by agencies and market studies, the data to be used is sometimes very heterogeneous, in the form of structured data or raw data.
- Veracity: this depends on the rigor and reflexes of the Data Scientist in checking the quality of the data, and on his or her business knowledge, which makes it possible to attest to the data’s credibility and to avoid introducing any interpretation bias.
- Value: the new insights obtained from the data guide scientific, technical, medical or business decisions for the company.
Data science enables the gathering of quantitative insights, such as trends, predictions, etc., by cross-referencing and analyzing raw data sets.
These insights are then used:
- to bring an additional perspective to strategic decision-making,
- to identify new opportunities,
- to create predictive models,
- in internal improvement processes.