Company
Zillow
Title
AI-Driven User Memory System for Dynamic Real Estate Personalization
Industry
Tech
Year
2025
Summary (short)
Zillow developed a sophisticated user memory system to address the challenge of personalizing real estate discovery for home shoppers whose preferences evolve significantly over time. The solution combines AI-driven preference profiles, embedding models, affordability-aware quantile models, and raw interaction history into a unified memory layer that operates across three dimensions: recency/frequency, flexibility/rigidity, and prediction/planning. This system is powered by a dual-layered architecture blending batch processing for long-term preferences with real-time streaming pipelines for short-term behavioral signals, enabling personalized experiences across search, recommendations, and notifications while maintaining user trust through privacy-centered design.
## Overview

Zillow has built a comprehensive AI-driven user memory system designed to personalize the real estate discovery experience for home shoppers navigating complex, months-long decision journeys. The case study, published in July 2025, describes how Zillow addresses a fundamental challenge in personalization: home shoppers' preferences are highly dynamic, with users often starting their search in one city or property type and shifting to entirely different criteria weeks later. Traditional static personalization systems struggle with this fluidity, which led Zillow to conceptualize personalization as a "living memory" rather than a simple catalog recommendation system.

While the article emphasizes "AI-driven" approaches and mentions semantic understanding through embeddings, the text does not explicitly detail the use of large language models (LLMs) or generative AI. The focus is primarily on machine learning-based personalization, embedding models, and preference profiling. However, the architecture and infrastructure described—particularly around user memory, context preservation, and dynamic adaptation—represent important LLMOps considerations relevant to any production AI system, including those incorporating LLMs for understanding user intent or generating personalized content.

## The Problem Space

Zillow recognizes that modern home shoppers expect platforms not just to display listings but to actively understand and guide them through their journey. The core challenge is that user preferences are neither static nor simple. A buyer might browse condos in Seattle initially but three weeks later focus on townhomes in a different state entirely. These shifts can occur due to market constraints, evolving affordability understanding, life changes, or simply exploration.
Static personalization systems that rely on fixed user profiles or simple collaborative filtering cannot adapt quickly enough to these dynamic journeys, leading to irrelevant recommendations and frustrated users. The company frames this as requiring "richer, more intelligent user memory" that goes beyond logging clicks and page views. They need a system that understands context, preserves meaningful patterns over time, and adapts gracefully to changes in behavior and intent while avoiding stale assumptions.

## The User Memory Concept

Zillow's approach centers on what they call "user memory"—an evolving, context-rich understanding of what each person values during their home shopping journey. This memory operates across three conceptual dimensions that guide the system architecture.

The first dimension is **recency and frequency**, distinguishing between what users care about right now versus what they've consistently valued over time. This requires the system to weight recent interactions differently from historical patterns while still preserving stable long-term preferences.

The second dimension is **flexibility and rigidity**, identifying which preferences are absolute must-haves (like needing three bedrooms) versus areas where users are open to exploration (perhaps being flexible on exact location or home type). This nuance is critical for recommendation systems to avoid being either too narrow or too broad.

The third dimension is **prediction and planning**, anticipating how preferences are likely to shift over time based on behavioral patterns and market dynamics. This forward-looking aspect helps the system proactively adapt rather than merely react to changes.

This memory layer powers multiple personalization surfaces across Zillow's platform, including search ranking, homepage recommendations, push notifications, and user-specific experiences.
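The flexibility/rigidity dimension has a concrete systems consequence: rigid preferences behave like hard filters, while flexible ones behave like soft ranking signals. A minimal sketch of how such a memory record might be shaped, under the assumption that field names and structure are purely illustrative (the article does not describe Zillow's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class UserMemory:
    """Illustrative shape of a per-user memory record. Field names are
    hypothetical, for exposition only, not Zillow's actual schema."""
    rigid: dict = field(default_factory=dict)       # must-haves, e.g. {"bedrooms_min": 3}
    flexible: dict = field(default_factory=dict)    # soft tastes, e.g. {"home_type": {"condo": 0.6}}
    short_term: dict = field(default_factory=dict)  # fast-decaying in-session signals
    long_term: dict = field(default_factory=dict)   # stable preferences built over months

def passes_rigid(listing: dict, memory: UserMemory) -> bool:
    """Rigid preferences act as hard filters applied before any soft ranking."""
    return listing.get("bedrooms", 0) >= memory.rigid.get("bedrooms_min", 0)
```

Treating must-haves as filters and flexible tastes as ranking features is one simple way a system could avoid being either too narrow or too broad.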
## Technical Architecture: Dual-Pipeline Approach

Zillow implements user memory through a dual-layered architecture that combines batch processing with real-time streaming pipelines. This hybrid approach balances computational efficiency against responsiveness.

**Batch pipelines** handle long-term preference modeling. These systems operate on extended observation windows and can employ higher-latency, more complex algorithms since long-term preferences don't change hour by hour. Batch jobs run daily to aggregate user activity across multiple sessions and generate stable, preference-aware representations covering features like price range, bedroom count, location preferences, and home-type affinities. The batch approach is efficient for capturing durable patterns that emerge over weeks or months of user behavior.

**Real-time streaming pipelines** capture short-term signals and transient shifts in behavior. These pipelines ingest behavioral signals—views, saves, searches, filter applications—and update user state within seconds. The streaming architecture uses what Zillow calls their "Online Data Platform tables" for near-instant access, enabling lightweight in-session models that complement the batch-generated profiles. Event-driven triggers based on these real-time signals power recommendations, alerts, and immediate personalization adjustments.

The architecture doesn't treat these as separate systems but **leverages them side by side** for a seamless experience. Long-term profiles provide stability and shape relevance across time, while real-time signals tune personalization moment to moment. Interestingly, Zillow notes that the same modeling logic often runs in both pipelines with different tuning parameters—a pragmatic approach that maintains consistency while optimizing for different latency and accuracy tradeoffs.
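The "same logic, different tuning" pattern can be sketched as a single decay-weighted aggregation run under two parameterizations: a slow-decay profile for the daily batch job and a fast-decay profile for the streaming path. This is a hypothetical illustration of the pattern, not Zillow's implementation, and the half-life values are invented:

```python
import time

def preference_weights(events, half_life_days, now=None):
    """Aggregate (timestamp, value) interaction events into a normalized
    preference distribution using exponential recency decay."""
    now = time.time() if now is None else now
    weights = {}
    for ts, value in events:
        age_days = (now - ts) / 86400.0
        w = 0.5 ** (age_days / half_life_days)  # weight halves every half-life
        weights[value] = weights.get(value, 0.0) + w
    total = sum(weights.values()) or 1.0
    return {v: w / total for v, w in weights.items()}

# Same aggregation logic, two tunings: a slow-decay profile recomputed by a
# daily batch job, and a fast-decay profile refreshed from the event stream.
def long_term_profile(events, now=None):
    return preference_weights(events, half_life_days=30.0, now=now)

def short_term_profile(events, now=None):
    return preference_weights(events, half_life_days=1.0, now=now)
```

Under this sketch, a user who browsed condos for weeks but switched to townhomes yesterday would still show "condo" as the dominant long-term value while "townhome" dominates the short-term profile, which is exactly the stability-versus-responsiveness split the two pipelines provide.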
The signals are processed through separate pipelines but consumed together by downstream systems, allowing the platform to preserve stability while responding to behavioral shifts.

## Building Blocks: Components of User Memory

Zillow's user memory system comprises several distinct but complementary components, each serving a specific purpose in the personalization stack.

**Preference profiles** form the foundation, summarizing what kinds of listings each user engages with across structured dimensions like price, location, number of bedrooms, and home type. These profiles are built from recent user interactions and capture probability distributions over preference values. For example, a profile might show that a user interacts most frequently with listings priced between $750K and $900K, demonstrates strong interest in homes with 3+ bedrooms, and focuses on specific ZIP codes. These structured profiles become input features for downstream recommendation models, powering personalized ranking and targeted experiences.

**Recency weighting** addresses the temporal dynamics of user preferences. Zillow uses decay functions to give recent, consistent interactions higher weight in the profile calculation. This makes models responsive to shifts—surfacing townhomes in Oakland if that's what a user started browsing this week, even if they were looking at condos in San Jose last month. The decay is currently based on absolute recency, but Zillow notes it can be tuned at multiple levels based on position in the interaction sequence. These tunable decay strategies enable construction of both short-term and long-term preference representations used in parallel across different personalization surfaces. The company sees opportunities to blend these temporal signals more deeply in future iterations, creating unified profiles that dynamically balance stability with recency.
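As a sketch of how such distributions could feed downstream ranking, a candidate listing can be scored by multiplying the profile probabilities of its attributes, with a small prior for unseen values so exploration is never scored to zero. The field names, values, and scoring rule here are illustrative assumptions, not Zillow's model:

```python
def score_listing(listing, profile, prior=0.01):
    """Score a listing against a structured preference profile, where
    `profile` maps each dimension to a {value: probability} distribution
    and `listing` maps dimensions to values. Illustrative only."""
    score = 1.0
    for dim, dist in profile.items():
        # Unseen values get a small prior instead of zeroing the score.
        score *= dist.get(listing.get(dim), prior)
    return score

# Example profile of the kind described above, e.g. built from
# decay-weighted interaction counts over structured dimensions.
profile = {
    "price_band": {"750K-900K": 0.7, "900K-1.1M": 0.3},
    "bedrooms": {"3+": 0.8, "2": 0.2},
}
candidates = [
    {"price_band": "900K-1.1M", "bedrooms": "2"},
    {"price_band": "750K-900K", "bedrooms": "3+"},
]
ranked = sorted(candidates, key=lambda l: score_listing(l, profile), reverse=True)
```

In a production ranker these profile probabilities would more likely enter as features of a learned model than as a literal product, but the sketch shows how structured distributions translate into personalized ordering.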
**Affordability-aware quantile models** address a limitation of pure preference profiles: they can miss nuance in constraints, particularly financial ones. While preference profiles capture what users like, affordability models help answer what users can realistically consider. These models assess whether users are browsing within or beyond their likely budget, how wide their price tolerance is, and which listings might stretch their range but still feel relevant. Zillow emphasizes that they understand user preferences based on both app/website interactions and what users can afford, creating a "360-degree view" of needs. They also mention exploring how other context sources, such as preferences shoppers share with their agents, can further enrich this understanding. This affordability-aware approach is especially valuable in constrained markets, helping recommend listings that fit both user preferences and financial realities.

**Embedding models** complement the structured preference profiles by capturing nuanced, unstructured features that don't fit neatly into categorical buckets. Users often gravitate toward features like textured walls or specific types of trees in the backyard—aspects that structured profiles alone cannot capture effectively. Embeddings provide semantic understanding, fill gaps in preference modeling, and can surface "surprising-but-relevant" options that might not match explicit filters but align with latent user tastes. While the article doesn't detail the embedding architecture or training approach, the mention of semantic understanding suggests these models learn representations from listing descriptions, images, or user behavior patterns. This is one area where LLM-based approaches could potentially play a role, though the text doesn't explicitly say so.

**Raw interaction history** serves a different purpose than the modeled representations.
While models help infer and generalize preferences, there are moments when the most useful memory is simply an accurate replay of what happened. Zillow retains timestamped event sequences for each user, including views, saves, searches, and filter applications. This raw behavioral history supports experiences that depend on continuity and transparency, such as "recently viewed homes" features that let users revisit listings without retracing their steps. This component acknowledges that not all personalization needs to be inferred—sometimes explicit history is the most trustworthy signal.

## Production Considerations and LLMOps Relevance

While the case study doesn't explicitly describe LLM deployment or generative AI applications, several aspects of Zillow's architecture represent LLMOps considerations applicable to any production AI system.

**Feature engineering and signal processing** at scale is a central challenge. Zillow processes behavioral signals from millions of users across multiple touchpoints, requiring robust data pipelines that handle both batch and streaming workloads. The ability to extract meaningful features from raw interactions, apply appropriate transformations (like recency weighting), and maintain consistency across different temporal scales is critical infrastructure work that would underpin any LLM-based personalization system.

**Model serving architecture** must support both high-throughput batch inference and low-latency real-time inference. The dual-pipeline approach reflects this reality—some personalization decisions can tolerate daily update cycles, while others require sub-second response times. This mirrors challenges in LLM deployment, where different serving strategies may be used for offline embedding generation versus online inference.

**Context management and memory** is explicitly central to Zillow's approach.
The concept of maintaining evolving user memory that balances recency with stability, structured with unstructured signals, and explicit with inferred preferences is highly relevant to LLM-based systems. Many modern LLM applications, particularly conversational agents and personalized assistants, face similar challenges in maintaining user context across sessions and adapting to changing needs.

**Evaluation and quality assurance** isn't detailed in the article, but the existence of multiple parallel representations (short-term vs. long-term profiles, structured preferences vs. embeddings) suggests Zillow likely has evaluation frameworks to measure relevance, diversity, and user satisfaction across different personalization approaches. This is a critical LLMOps concern: how do you validate that a personalization system is actually helping users rather than reinforcing biases or creating filter bubbles?

**Privacy and compliance** receives explicit attention. Zillow emphasizes that their personalization infrastructure is designed to respect user choices around data usage, comply with regulations like CCPA, protect identities through secure data handling, and support transparency and control. They frame this as "trust-centered personalization," recognizing that effective personalization depends on user trust. For LLM-based systems, privacy considerations are even more complex given concerns about training-data memorization, exposure of personally identifiable information, and the potential for models to reveal sensitive information through inference. Zillow's emphasis on privacy-aware data handling and user control represents best practice for any production AI system.

**System integration and orchestration** is implied throughout the case study. The user memory layer feeds into multiple downstream systems: search ranking, recommendations, alerts, and various personalization surfaces.
This requires careful orchestration to ensure consistent user experiences across touchpoints while allowing individual systems to optimize for their specific objectives. In LLMOps contexts, this relates to challenges around prompt management, context injection, and ensuring that different LLM-powered features maintain a coherent understanding of user state.

## Limitations and Critical Assessment

While the article presents Zillow's approach as innovative and user-centric, several aspects deserve critical examination.

First, the text is clearly promotional in nature, emphasizing benefits and innovation while providing limited detail on challenges, failures, or tradeoffs. There's no discussion of cases where the system fails, how often recommendations miss the mark, or what error rates look like.

Second, despite the title's emphasis on "AI-driven" memory and the article's framing around semantic understanding, there's limited technical detail on the actual AI/ML models used. Are the embeddings generated by transformer models, collaborative filtering approaches, or some hybrid? What specific architectures power the preference profiles? The lack of technical depth makes it difficult to assess the true sophistication of the approach versus more conventional recommendation systems with updated terminology.

Third, the article doesn't demonstrate that this is fundamentally different from well-established techniques in recommender systems. Recency weighting, combining collaborative and content-based filtering, using embeddings for semantic similarity—these are established practices in industrial recommendation systems. While Zillow's integrated approach and scale are impressive, it's unclear whether this represents a significant technical advance or solid execution of known techniques.

Fourth, there's no discussion of the cold-start problem or how the system handles new users without behavioral history.
Real estate transactions are infrequent—many users might visit once or twice before making a decision. How does the memory system work with sparse data?

Fifth, the evaluation methodology is absent. How does Zillow measure whether their personalization actually helps users make better decisions? Are users finding homes faster? Are they more satisfied? Do they feel the recommendations are relevant? Without concrete metrics or A/B testing results, it's difficult to assess the actual impact.

Finally, the privacy and trust discussion, while laudable, is somewhat superficial. What specific mechanisms ensure privacy? How is the tension between personalization (which requires data collection and analysis) and privacy (which limits data use) actually resolved? What choices do users have, and how transparent is the system about what data is used and how?

## Future Directions and Open Questions

Zillow indicates several areas for future development. They mention opportunities to "more deeply blend" temporal signals, creating unified profiles that dynamically balance stability with recency rather than maintaining separate short-term and long-term representations. This suggests current limitations in how the dual-pipeline architecture integrates signals.

The company also notes exploring how preferences that shoppers share with their agents could enrich user understanding. This raises interesting questions about multi-modal personalization that incorporates explicit user statements alongside behavioral signals. This is an area where LLMs could play a significant role: understanding natural-language preferences, extracting structured information from agent conversations, or even generating personalized search queries or listing descriptions based on stated preferences.

The emphasis on making personalization "feel more intuitive and human" suggests ambitions beyond current capabilities.
This could indicate future integration of conversational interfaces, natural-language search, or more sophisticated explanation mechanisms—all areas where LLMs excel.

However, several questions remain unanswered. How does the system handle conflicting signals, such as when browsing behavior diverges from saved listings? How does it distinguish between serious intent and casual browsing? How does it account for household decision-making, where multiple people with different preferences might use the same account? These are complex challenges that any personalization system, whether ML-based or LLM-enhanced, must address.

## Conclusion

Zillow's user memory system represents a thoughtful, production-scale approach to personalization in a complex domain with highly dynamic user preferences. The dual-pipeline architecture combining batch and real-time processing, the multi-component memory layer blending structured profiles with embeddings and raw history, and the emphasis on privacy and trust all reflect mature thinking about production AI systems.

However, while the article frames this as AI-driven innovation, the technical details provided don't clearly demonstrate use of cutting-edge AI techniques like LLMs or generative models. Instead, this appears to be a well-executed implementation of established recommendation-system techniques applied thoughtfully to the real estate domain's unique challenges. That doesn't diminish its value: production systems that reliably serve millions of users at scale while respecting privacy are impressive achievements regardless of whether they use the latest AI buzzwords.

For practitioners interested in LLMOps, the case study offers valuable lessons in architecture patterns (batch/streaming hybrids), context management (maintaining evolving user memory), integration challenges (serving multiple downstream systems), and the critical importance of privacy and trust in consumer-facing AI applications.
These considerations apply equally to LLM-based systems, even if Zillow's specific implementation may not rely heavily on LLMs. As Zillow continues evolving this infrastructure, integration of language models for understanding user intent, generating personalized content, or powering conversational interfaces would be natural extensions of their current foundation.
