Case Study

How DoiT Empowered mscope with Data-Driven Insights and Efficient Experimentation

Client
mscope
Industries
Financial Services & Insurance

Technologies
Amazon S3, Amazon SageMaker, Amazon Web Services
Region
EMEA, South EMEA
Country
Spain

Meet mscope

mscope.tech is an early-stage startup dedicated to maximizing the potential of data for informed investment decision-making. Its primary objective is to unlock the value hidden within vast amounts of information. Using machine learning and AI, mscope develops methodologies to summarize documents and extract crucial features from web-scraped data. These techniques help investors make more accurate and strategic choices in the ever-evolving landscape of investment opportunities.

The challenge

As an early-stage startup, mscope faced several challenges because it lacked in-house machine learning expertise. It needed to work out how to establish the infrastructure and workflows for training and deploying models on its data. Specifically, it required assistance with the following use cases:

First, mscope needed to generate text in a common language (English) from multi-language web-scraped data. This involved using large language models (LLMs) to generate comprehensive company information in English.
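As a rough illustration of this use case, the sketch below builds an instruction prompt that asks an LLM for an English description of a company from scraped text in another language. The record fields, the prompt wording, and the character budget are assumptions made for this example and are not taken from mscope's actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class ScrapedPage:
    """One web-scraped record; the field names are illustrative assumptions."""
    company: str
    language: str  # language code of the source text, e.g. "es"
    text: str


def build_english_summary_prompt(page: ScrapedPage, max_chars: int = 4000) -> str:
    """Build a prompt asking an LLM for an English company description."""
    # Collapse whitespace runs left over from HTML extraction.
    cleaned = " ".join(page.text.split())
    # Truncate so the prompt stays within a hypothetical context budget.
    excerpt = cleaned[:max_chars]
    return (
        "You are given text scraped from a company website.\n"
        f"Source language code: {page.language}\n"
        f"Company: {page.company}\n\n"
        f"Text:\n{excerpt}\n\n"
        "Write a concise description of the company in English."
    )
```

The prompt states the source language explicitly, so the model does not have to detect it, and always requests English output regardless of the input language.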

Second, mscope required a multi-level classification system for labeling companies by their role in the supply chain and by the nature of their products. This involved implementing classification algorithms to categorize companies as producers or consumers and to tag them with specific product characteristics, such as organic food production.
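One way to picture such a multi-level label is as a supply-chain role plus a set of product tags. The sketch below is only a hypothetical data model: the tag vocabulary and all class and function names are invented for illustration, and the classifier that would assign these labels is out of scope.

```python
from dataclasses import dataclass
from enum import Enum


class SupplyChainRole(Enum):
    """First classification level: the company's role in the supply chain."""
    PRODUCER = "producer"
    CONSUMER = "consumer"


# Second level: product characteristics. This vocabulary is hypothetical.
ALLOWED_PRODUCT_TAGS = frozenset({"organic_food"})


@dataclass(frozen=True)
class CompanyLabel:
    role: SupplyChainRole
    product_tags: frozenset


def make_label(role: str, tags: list) -> CompanyLabel:
    """Validate raw classifier output and wrap it in a typed label."""
    unknown = set(tags) - ALLOWED_PRODUCT_TAGS
    if unknown:
        raise ValueError(f"unknown product tags: {sorted(unknown)}")
    # SupplyChainRole(role) raises ValueError for an unrecognized role string.
    return CompanyLabel(SupplyChainRole(role), frozenset(tags))
```

Keeping the two levels separate lets a company carry one role and any number of product tags, and rejecting unknown values early keeps malformed model output out of downstream analysis.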

By addressing these challenges, mscope aimed to enhance its data analysis capabilities and provide valuable insights for better investment decision-making.

The solution

mscope engaged DoiT’s expert services to advise on and address its machine learning needs. As mscope lacked expertise in this specialization, DoiT provided guidance and solutions in several areas.

Firstly, DoiT helped mscope with budgeting and estimating costs for the necessary AWS services. This ensured that mscope could plan its financial resources effectively.

Secondly, DoiT designed a solution for fine-tuning transformer-based models such as Mistral 7B, backed by the Hugging Face framework. DoiT engineers created an architecture and workflow that met mscope’s specific requirements.

In the existing setup, mscope used a long-running EMR cluster with several defined steps. The first step cleaned and normalized the web-scraped and RDS data. The second step created prompts from the cleaned data. The third step used the generated prompts to extract relevant data by invoking a SageMaker JumpStart endpoint.
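The endpoint invocation in the final step can be sketched as follows. The function takes the client as an argument; in practice this would be `boto3.client("sagemaker-runtime")`. The request and response JSON shapes are assumptions, since they depend on the model container behind the endpoint, and the endpoint name is whatever the caller supplies.

```python
import json


def generate_text(runtime_client, endpoint_name: str, prompt: str,
                  max_new_tokens: int = 256) -> str:
    """Send one prompt to a SageMaker real-time endpoint and return the text.

    `runtime_client` is expected to behave like
    boto3.client("sagemaker-runtime").
    """
    # Assumed request shape; check the model's documentation for the real one.
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload).encode("utf-8"),
    )
    # The response body is a stream of JSON bytes.
    result = json.loads(response["Body"].read())
    # Assumed response shape: [{"generated_text": "..."}]
    return result[0]["generated_text"]
```

Passing the client in, rather than creating it inside the function, makes the step easy to exercise locally with a stub before pointing it at a billed endpoint.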

However, this architecture was expensive and complex to maintain. DoiT therefore recommended that mscope use SageMaker batch transform to execute these steps and invoke the SageMaker model asynchronously, and orchestrate the steps with AWS Step Functions.
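To make the recommendation concrete, here is a minimal sketch of a Step Functions state machine definition (Amazon States Language, built as a Python dictionary) that runs a SageMaker batch transform job and waits for it to finish. It is not the architecture DoiT delivered: the state names, the placeholder preparation state, the instance type, and the input format are all assumptions chosen for illustration.

```python
import json


def build_state_machine(model_name: str, input_s3_uri: str, output_s3_uri: str) -> dict:
    """Return an Amazon States Language definition as a plain dict."""
    return {
        "Comment": "Illustrative pipeline: prepare prompts, then batch inference",
        "StartAt": "PreparePrompts",
        "States": {
            # Placeholder standing in for the cleaning and prompt-building steps.
            "PreparePrompts": {"Type": "Pass", "Next": "BatchInference"},
            "BatchInference": {
                "Type": "Task",
                # The .sync suffix makes the workflow wait for job completion.
                "Resource": "arn:aws:states:::sagemaker:createTransformJob.sync",
                "Parameters": {
                    # Job name is read from the execution input at run time.
                    "TransformJobName.$": "$.job_name",
                    "ModelName": model_name,
                    "TransformInput": {
                        "DataSource": {
                            "S3DataSource": {
                                "S3DataType": "S3Prefix",
                                "S3Uri": input_s3_uri,
                            }
                        },
                        "ContentType": "application/jsonlines",  # assumed format
                        "SplitType": "Line",
                    },
                    "TransformOutput": {"S3OutputPath": output_s3_uri},
                    "TransformResources": {
                        "InstanceCount": 1,
                        "InstanceType": "ml.g5.2xlarge",  # hypothetical choice
                    },
                },
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    definition = build_state_machine("my-model", "s3://my-bucket/in/", "s3://my-bucket/out/")
    print(json.dumps(definition, indent=2))
```

Because a batch transform job provisions instances only for the duration of the job, this pattern avoids paying for an always-on cluster and endpoint between runs, which is the cost concern described above.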

The proposed architecture is as follows:

In addition, DoiT provided detailed cost estimates for the proposed architecture, helping mscope plan its budget effectively.

Finally, DoiT’s relationship with AWS allowed it to get early access to Amazon Bedrock on mscope’s behalf, which enabled the company to experiment with the latest LLMs at lower costs.

Overall, DoiT’s unrivaled expertise and guidance provided mscope with a customized machine learning roadmap. This roadmap covered data preparation, model fine-tuning, deployment, and budgeting. With DoiT’s assistance, mscope was able to leverage machine learning effectively and create value from its data.

The result

DoiT’s solution had a significant impact on mscope. Following DoiT’s recommendation to use advanced models such as LLMs, mscope enhanced its preprocessing of raw multilingual web-scraped data, which led to an improvement in the mscope classifier algorithm.

Additionally, DoiT assisted mscope in leveraging LLMs, optimizing processing times, and reducing costs. Drawing on its AWS expertise, DoiT also helped mscope resolve issues related to SageMaker endpoints and EMR cluster usage.

In summary, DoiT’s expertise and solution provided mscope with valuable insights from multilingual data and streamlined experimentation. These outcomes empowered mscope to improve and optimize its product. By harnessing these capabilities, mscope maximized the value derived from its data and cloud services, leading to improved investment outcomes and enhanced business performance.

What's Next?

Building on the success of the collaboration, mscope would like to continue its partnership with DoiT to build a new company classifier. Having enhanced the web-scraped input data, mscope will now focus on further improving its company classification.

Marion Roussel, VP Data at mscope
“We are grateful for the tremendous support provided by DoiT. As an early-stage startup navigating complex challenges in machine learning new techniques, DoiT’s expertise was instrumental. From budgeting AWS services to architecting a tailored solution, their guidance empowered us to harness advanced technologies effectively. DoiT’s solution delivered profound results, enabling us to use our web scraping data more efficiently and gain more insight from them, improving our classification algorithm. Their collaborative approach, technical proficiency, and commitment to success significantly contributed to our growth and success in the dynamic investment landscape.”
