## Overview
This case study is drawn from a CTO event presentation featuring speakers from Merantix Momentum (the professional services arm of the Merantix AI ecosystem) and Red Tech Cortex. The presentation focuses on AI applications in pharmaceutical and healthcare contexts, with particular emphasis on human-AI collaborative systems that progressively automate complex annotation and review tasks.
Merantix Momentum reports having completed over 150 AI projects spanning strategy consulting, corporate research, and custom solution development. The speakers single out healthcare and pharmaceuticals because the sector lets them apply the full toolbox of AI methods across the entire value chain, from drug development to precision medicine to diagnostics.
## Key Case Study: Boehringer Ingelheim Video Analysis System
The primary technical example presented is a system built for Boehringer Ingelheim, a major pharmaceutical company. The business problem centers on drug safety assessment, which requires analysts to review large volumes of video footage showing rodents to identify potential toxicity signals that might indicate whether a substance should proceed to human trials.
The challenge is described as a "find Waldo" problem at scale: analysts must detect very rare behaviors across extensive video content. This is traditionally a sequential, labor-intensive task requiring sustained human attention.
### Technical Architecture and Approach
The solution implements a human-in-the-loop active learning system with the following workflow:
- **Initial Phase**: A human expert describes the specific behavior they are looking for in natural language or through examples
- **Learning Phase**: The system surfaces video snippets where it wants human feedback, progressively learning to recognize the target patterns
- **Autonomous Phase**: Eventually the system can perform annotation completely autonomously for well-learned behaviors
This represents a shift from tools where "humans use AI" to systems where "humans co-create with AI." The architecture is designed to learn from the interaction patterns and feedback loops, not just the final labels.
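The presentation does not say which model or query strategy the system uses. As a rough illustration of the learning phase only, the sketch below assumes each video snippet has already been reduced to a numeric feature vector, uses a small logistic regression as a stand-in model, and picks the next snippet to show the human by uncertainty sampling (the prediction closest to 0.5). All function names, and the `oracle` callback that plays the role of the human expert, are hypothetical.

```python
import math
import random


def predict_proba(weights, x):
    """Probability that a snippet with features x shows the target behavior."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit(labeled, dims, epochs=200, lr=0.5):
    """Fit logistic regression with stochastic gradient descent."""
    weights = [0.0] * (dims + 1)  # weights[0] is the bias term
    for _ in range(epochs):
        for x, y in labeled:
            err = predict_proba(weights, x) - y
            weights[0] -= lr * err
            for i, xi in enumerate(x):
                weights[i + 1] -= lr * err * xi
    return weights


def most_uncertain(weights, pool):
    """Index of the unlabeled item whose prediction is closest to 0.5."""
    return min(
        range(len(pool)),
        key=lambda i: abs(predict_proba(weights, pool[i]) - 0.5),
    )


def active_learning(pool, oracle, seed_labeled, rounds):
    """Repeatedly ask the oracle (the human) about the most uncertain item."""
    labeled = list(seed_labeled)
    pool = list(pool)
    dims = len(labeled[0][0])
    weights = fit(labeled, dims)
    for _ in range(rounds):
        if not pool:
            break
        x = pool.pop(most_uncertain(weights, pool))
        labeled.append((x, oracle(x)))  # human feedback on the surfaced item
        weights = fit(labeled, dims)    # retrain on everything labeled so far
    return weights, labeled


if __name__ == "__main__":
    rng = random.Random(0)
    pool = [[rng.random()] for _ in range(200)]
    # Hypothetical ground truth: the behavior is present when the feature > 0.6.
    oracle = lambda x: 1 if x[0] > 0.6 else 0
    seed = [([0.0], 0), ([1.0], 1)]
    weights, labeled = active_learning(pool, oracle, seed, rounds=10)
    print("queried features:", [round(x[0], 2) for x, _ in labeled[2:]])
```

The point of the design is that the human labels only the snippets the model is least sure about, so annotation effort concentrates near the decision boundary instead of being spread over the whole video archive.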
### Foundation Model Strategy
The presentation emphasizes that these interactive learning systems are not tied to a single use case. Merantix is exploring how to build foundation models that capture patterns across multiple experiments and datasets, enabling faster fine-tuning for new intents. The strategy is presented for three modalities:
- **Tabular Data**: An active research area where Merantix has published a white paper on better representations of structured data, aiming to outperform traditional methods like XGBoost or CatBoost
- **Time Series**: Described as "a very challenging modality" that benefits significantly from learning priors across experiments to reduce annotation requirements
- **AI for Science (especially physics-based simulations)**: Because these simulations are computationally expensive, foundation models can help teams iterate faster on design exploration for drugs, crystal structures, and similar applications
## Document Processing for Legal and Compliance
A second major application area discussed is document review in legal and compliance contexts. The key constraint here is that errors are not allowed, making full automation inappropriate. The same progressive automation mechanism used for video data is applied to documents:
- The system learns from human examples
- It gradually takes over more tasks as confidence increases
- For anything it cannot handle reliably, humans remain in the loop
- The speaker describes the result as a "new way of usability experience by combining the advantages of humans and of AI algorithms"
This represents a pragmatic approach to LLMOps where the production system is designed from the start to handle uncertainty and maintain human oversight rather than attempting full automation prematurely.
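One way to picture this kind of progressive handover is a router that sends an item to full automation only when two conditions hold: the model is confident about that item, and its recent track record against human reviewers is strong. The sketch below is an assumption about how such a gate could look, not the mechanism from the presentation; the class name, thresholds, and window size are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ProgressiveRouter:
    """Routes items to 'auto' or 'human' based on confidence and track record."""

    confidence_threshold: float = 0.95  # minimum per-item model confidence
    agreement_threshold: float = 0.98   # minimum recent human/model agreement
    window: int = 50                    # number of recent reviews considered
    history: deque = field(init=False)

    def __post_init__(self):
        # Only the most recent `window` review outcomes are kept.
        self.history = deque(maxlen=self.window)

    def record_review(self, model_label, human_label):
        """Store whether the model agreed with the human on a reviewed item."""
        self.history.append(model_label == human_label)

    def agreement(self):
        """Share of recent reviews where model and human agreed."""
        return sum(self.history) / len(self.history) if self.history else 0.0

    def route(self, confidence):
        """Return 'auto' only with a full, strong history and high confidence."""
        trusted = (
            len(self.history) == self.window
            and self.agreement() >= self.agreement_threshold
        )
        if trusted and confidence >= self.confidence_threshold:
            return "auto"
        return "human"


if __name__ == "__main__":
    router = ProgressiveRouter(window=5, agreement_threshold=0.8)
    print(router.route(0.99))  # no track record yet
    for _ in range(5):
        router.record_review("relevant", "relevant")
    print(router.route(0.99))
```

Because the history is a sliding window, a run of disagreements pulls the system back to human review, so automation can be withdrawn as well as granted.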
## Prescriptive AI and Scenario Planning
The presentation introduces a shift toward "prescriptive AI" in pharmaceutical contexts. An example given is a social media listening tool built for pharmacovigilance, where pharmaceutical companies must monitor social media for side effect reports when drugs are released to market.
However, the more interesting trend described is moving beyond trend detection and forecasting toward scenario-based decision planning. This involves:
- Analyzing corner cases specifically
- Building causal understanding of how interventions propagate through systems
- Exploring multiple potential interventions to find those leading to desired outcomes
- Moving from "forecasting one solution" to "exploring many solutions" for better data-based decision making
The speaker notes this is transitioning from theoretical academic approaches to real-world production applications.
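The difference between forecasting one outcome and exploring many interventions can be shown with a toy structural model. The sketch below is purely illustrative: the three variables, their coefficients, and the helper names `simulate` and `explore` are invented for the example and do not come from the presentation. An intervention overrides a variable's value, and the effect propagates to everything downstream.

```python
from itertools import product

# Hypothetical toy structural model. Each variable is computed from its
# parents; entries must be listed so that parents come before children.
MODEL = {
    "driver": lambda v: 1.0,
    "mediator": lambda v: 0.5 * v["driver"],
    "outcome": lambda v: 2.0 * v["mediator"] + 1.0,
}


def simulate(model, interventions):
    """Evaluate the model, replacing intervened variables with fixed values."""
    values = {}
    for name, fn in model.items():
        if name in interventions:
            values[name] = interventions[name]  # intervention cuts off parents
        else:
            values[name] = fn(values)
    return values


def explore(model, options, outcome, target):
    """Try every combination of candidate interventions; keep those on target."""
    names = list(options)
    hits = []
    for combo in product(*(options[n] for n in names)):
        do = dict(zip(names, combo))
        result = simulate(model, do)
        if target(result[outcome]):
            hits.append((do, result[outcome]))
    return hits


if __name__ == "__main__":
    scenarios = explore(
        MODEL,
        options={"driver": [0.0, 1.0, 2.0, 4.0]},
        outcome="outcome",
        target=lambda y: y >= 3.0,
    )
    for do, y in scenarios:
        print(do, "->", y)
```

A real system would need a learned or expert-specified causal model and a smarter search than exhaustive enumeration; the sketch only shows the shape of the question being asked.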
## Red Tech Cortex: Enterprise Knowledge Management Platform
The second speaker presents Red Tech Cortex, described as one of the leading European AI platforms with approximately 2 million users. According to the speaker, the company was ranked 14th in best term software companies and 10th in best AI software globally in 2025.
### Product Architecture
The platform enables users to create AI agents for knowledge work. The core use case is connecting to enterprise data sources (Google Drive, OneDrive, Slack, and other databases) and enabling semantic search and question-answering across this content. Key technical features include:
- **Ongoing indexing**: All connected content is indexed on a continuous basis
- **RAG-based retrieval**: Similar to modern AI assistants but with deeper enterprise integrations developed since 2021
- **Metadata extraction**: Automatically generating metadata from files including inferred department, access levels, and confidentiality classification
- **Provider aggregation**: Allowing companies to use multiple AI providers without switching their knowledge connections and integrations, enabling testing of which provider works best for specific prompts
The platform also supports web search capabilities, with use cases in legal research for searching case law across different sources.
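The presentation does not describe the platform's internals. To make the combination of retrieval and extracted metadata concrete, the sketch below assumes a toy in-memory index, a bag-of-words cosine similarity in place of a real embedding model, and two hypothetical metadata fields (`department`, `confidential`) that filter what a given user may retrieve before ranking happens.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    text: str
    department: str     # hypothetical inferred metadata
    confidential: bool  # hypothetical confidentiality classification


def vectorize(text):
    """Bag-of-words term counts; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, user_departments, allow_confidential=False, k=3):
    """Return ids of the top-k matching documents the user is allowed to see."""
    q = vectorize(query)
    visible = [
        d for d in docs
        if d.department in user_departments
        and (allow_confidential or not d.confidential)
    ]
    scored = [(cosine(q, vectorize(d.text)), d) for d in visible]
    scored = [(s, d) for s, d in scored if s > 0]  # drop non-matching docs
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d.doc_id for _, d in scored[:k]]


if __name__ == "__main__":
    docs = [
        Doc("a", "vacation policy for employees", "hr", False),
        Doc("b", "salary bands and vacation budget", "hr", True),
        Doc("c", "server deployment guide", "it", False),
    ]
    print(retrieve("vacation policy", docs, {"hr"}))
```

Filtering before ranking means restricted documents never reach the prompt that is later sent to a language model, which is usually the safer order of operations.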
### Production Considerations
The speaker highlights several production-relevant features:
- **Model flexibility**: Companies don't want to be locked into single providers and need the ability to switch between AI backends
- **EU-based processing**: All processing happens within EU jurisdiction, addressing data sovereignty concerns
- **Productivity metrics**: Reported productivity gains of 30 new hire equivalents for every 100 onboarded users, though this metric should be viewed with appropriate skepticism as it comes from marketing materials
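The model-flexibility point can be illustrated with a thin abstraction layer: application code talks to a registry, and each provider is just a function from prompt to completion. The sketch below is a hypothetical design, not the platform's API; the providers are local stubs rather than real vendor clients, and the scoring function is supplied by the caller.

```python
from typing import Callable, Dict, List, Tuple

# A provider is any callable that maps a prompt to a completion string.
Provider = Callable[[str], str]


class ProviderRegistry:
    """Keeps application code independent of any single AI backend."""

    def __init__(self) -> None:
        self._providers: Dict[str, Provider] = {}

    def register(self, name: str, provider: Provider) -> None:
        self._providers[name] = provider

    def complete(self, name: str, prompt: str) -> str:
        """Send the prompt to one named provider."""
        return self._providers[name](prompt)

    def compare(
        self, prompt: str, score: Callable[[str], float]
    ) -> List[Tuple[str, float]]:
        """Run the prompt on every provider; rank by a caller-supplied score."""
        results = [(name, score(fn(prompt))) for name, fn in self._providers.items()]
        # Highest score first; ties are broken alphabetically by provider name.
        return sorted(results, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    registry = ProviderRegistry()
    # Stub providers standing in for real backends.
    registry.register("echo", lambda prompt: prompt)
    registry.register("upper", lambda prompt: prompt.upper())
    print(registry.compare("hi", score=lambda out: float(out.isupper())))
```

Swapping a backend then means registering a different callable, while knowledge connections and prompts stay untouched.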
Notable customers mentioned include Miele (described as one of the biggest producers in Germany) and various consulting companies.
## Critical Assessment
The presentation provides useful examples of human-AI collaborative systems in production, but several caveats should be noted:
- The specific metrics and results are largely unquantified beyond general claims of improvement
- The Boehringer Ingelheim case study is presented at a high level without detailed performance metrics or validation approaches
- The claims about productivity gains from Red Tech Cortex (30 new hires worth of productivity per 100 users) are marketing claims that would require independent verification
- The presentation is oriented toward selling professional services and products, so the framing naturally emphasizes benefits
That said, the architectural patterns described (progressive automation, human-in-the-loop active learning, semi-automated systems for high-stakes domains) represent pragmatic approaches to deploying AI in production environments where reliability and human oversight remain critical. The emphasis on not aiming for full automation in contexts where errors are not allowed reflects mature thinking about appropriate AI deployment patterns.
## Key Themes for LLMOps
The presentation surfaces several themes relevant to production AI systems:
- **Progressive automation**: Rather than binary automation decisions, building systems that can smoothly transition between human and AI control based on confidence
- **Foundation model strategy beyond LLMs**: Exploring foundation models for tabular data, time series, and scientific applications
- **Human-AI interaction design**: Rethinking UIs to support co-creation rather than simple tool usage
- **Provider abstraction**: Building systems that don't lock into single AI providers
- **Domain-specific safety requirements**: Designing for contexts where errors have serious consequences (drug safety, legal compliance)
The overall message is that many production AI systems fall outside the "agents and MCP" paradigm that receives most of the attention, and that careful human-AI interaction design is essential for critical applications.