{"id":2095,"date":"2026-02-23T21:08:34","date_gmt":"2026-02-23T21:08:34","guid":{"rendered":"https:\/\/www.katokane.com\/cm\/?post_type=case-study&#038;p=2095"},"modified":"2026-02-23T22:30:43","modified_gmt":"2026-02-23T22:30:43","slug":"2095","status":"publish","type":"case-study","link":"https:\/\/www.katokane.com\/cm\/case-studies\/2095\/","title":{"rendered":"Modernizing Kroger Search from Lexical to Semantic and Learning-to-Rank"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\">Summary<\/h1>\n\n\n\n<p>I led a search modernization program at Kroger that combined platform strategy, cross-functional execution, and relevance science. We moved from a primarily lexical search system to semantic retrieval and learning-to-rank, while also rebuilding the analytics and experimentation foundation needed to scale decision-making. The program improved conversion and search engagement, and it also changed how the organization operated around search performance.<\/p>\n\n\n\n<div class=\"wp-block-group case-study-outcomes has-surface-background-color has-background is-layout-flow wp-block-group-is-layout-flow\">\n<h3 class=\"wp-block-heading has-l-font-size\">Outcome Highlights<\/h3>\n\n\n\n<div class=\"wp-block-columns case-study-outcomes-grid is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<p class=\"case-study-metric\">+5%<\/p>\n\n\n\n<p class=\"case-study-metric-label\">Search add-to-cart conversion (sustained 2023 \u2192 2024)<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<p class=\"case-study-metric\">+$0.20<\/p>\n\n\n\n<p class=\"case-study-metric-label\">Avg. 
add-to-cart price (2023)<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<p class=\"case-study-metric\">$3.3B \u2192 $5B<\/p>\n\n\n\n<p class=\"case-study-metric-label\">Search platform revenue (2021\u20132023)<\/p>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Role and scope<\/h2>\n\n\n\n<p>At Kroger, I was the Senior Manager of Product for Search Backend and Infrastructure. I owned the search platform direction across backend search systems, ranking, analytics, and the infrastructure work needed to improve relevance at scale.<\/p>\n\n\n\n<p>When I stepped into the role, our search experience was still largely lexical. We had some purchase history signals for logged-in customers, but the search system itself was still mostly matching words. It worked for a lot of straightforward queries, but it struggled in the places that matter most in grocery, where customers search in messy, human ways.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The problem we needed to solve<\/h2>\n\n\n\n<p>The issue was not that search was broken. It was that it was hitting the ceiling of what lexical matching could do on its own.<\/p>\n\n\n\n<p>Customers were typing broad intent, brand terms, partial phrases, category language, and seasonal language. They did not always search with the same words we used in the catalog. They also did not shop in a strictly linear way. A lot of grocery behavior is mixed between planned shopping and impulse decisions, and our search and browse experiences needed to support both.<\/p>\n\n\n\n<p>We also had another practical problem. To build stronger ranking systems, we needed better behavioral data and cleaner instrumentation. 
We could not jump straight to better ML ranking if the underlying search analytics and tagging were inconsistent.<\/p>\n\n\n\n<p>So the work became bigger than &#8220;launch semantic search.&#8221; It was really a search platform modernization effort with three connected parts:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Clean up the data and instrumentation<\/li>\n\n\n\n<li>Improve retrieval with semantic understanding<\/li>\n\n\n\n<li>Improve ranking with behavioral and contextual signals<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">How I approached it<\/h2>\n\n\n\n<p>I did not treat this as a one-time algorithm upgrade. I treated it as a system change.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1) Start with the data foundation<\/h3>\n\n\n\n<p>Before we pushed hard on semantic retrieval and ranking models, I focused on cleaning up search behavioral analytics and tagging. We needed a stronger foundation for the data science team to build on, especially for query2vec and later ranking features.<\/p>\n\n\n\n<p>That work was less visible than a ranking launch, but it mattered a lot. It gave us cleaner behavioral signals, better reporting, and a more reliable way to evaluate whether search changes were actually helping.<\/p>\n\n\n\n<p>I also led an analytics revamp with the analytics and data science teams. We built a centralized Power BI dashboard that gave stakeholders one place to understand how search was performing across the funnel, including search cart conversion, experimentation performance, monetization conversion, and personalization conversion. That changed how we worked because it moved conversations away from opinions and toward shared metrics.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2) Introduce semantic retrieval in a way the business could trust<\/h3>\n\n\n\n<p>Once the data foundation was in better shape, I worked closely with data science on semantic retrieval. 
We built the path toward vector-driven semantic understanding by supporting query2vec and prod2vec so we could improve candidate generation.<\/p>\n\n\n\n<p>This was an important shift. Instead of relying only on lexical matching to decide what products should even be considered, we now had a semantic layer that could better understand the relationship between what a customer typed and the products we might show.<\/p>\n\n\n\n<p>I was careful about how we rolled this out. Grocery search is high-volume and high-trust. We could not treat semantic retrieval like a research project. The implementation needed to fit the reality of the platform, the user experience, and the business expectations.<\/p>\n\n\n\n<p>That meant we introduced it in a controlled way and made sure we still preserved the behaviors we knew customers depended on, especially for clear exact-intent searches.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3) Improve ranking with behavioral and contextual signals<\/h3>\n\n\n\n<p>Retrieval was only part of the story. Once we had stronger candidate generation, we needed to improve result ordering.<\/p>\n\n\n\n<p>I worked with the data science team on iterative ranking improvements, including LSTM-based ranking, and expanded the set of features used in ranking. We incorporated behavioral signals and contextual features, including seasonal signals, to better reflect how people actually shop.<\/p>\n\n\n\n<p>This work showed up across multiple surfaces, not just one list page. We implemented and improved ranking behavior in Grid, Carousels, and Browse. That mattered because customers do not experience search as a single endpoint. 
They move across surfaces, and we needed consistency in how relevance performed.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How we evaluated success<\/h2>\n\n\n\n<p>I wanted the team to be able to answer a simple question at every stage: is this actually improving the shopping experience in a way that matters?<\/p>\n\n\n\n<p>We evaluated success through a mix of experimentation, behavioral metrics, and operational monitoring.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Product and business metrics<\/h3>\n\n\n\n<p>We tracked search performance using outcomes that the business cared about:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>add-to-cart conversion from search<\/li>\n\n\n\n<li>average add-to-cart price per product<\/li>\n\n\n\n<li>search visits and search engagement<\/li>\n\n\n\n<li>search visits with cart add<\/li>\n\n\n\n<li>internal search usage trends<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Experimentation and iteration<\/h3>\n\n\n\n<p>We used iterative experimentation to improve ranking and relevance over time, not just a one-time launch evaluation. The goal was to treat relevance work as an ongoing product capability, with a loop for learning and tuning.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cross-functional visibility<\/h3>\n\n\n\n<p>The dashboard work was part of evaluation too. It gave product, analytics, data science, and leadership a shared way to look at performance and make decisions. 
That made it much easier to align on what to improve next.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Outcomes<\/h2>\n\n\n\n<p>The search modernization work drove measurable gains in both customer behavior and business performance.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>5% lift in search add-to-cart conversion from semantic retrieval and ML ranking improvements<\/li>\n\n\n\n<li>$0.20 increase in average add-to-cart price per product in 2023<\/li>\n\n\n\n<li>Search platform revenue grew from $3.3B to $5B between 2021 and 2023, supported by ongoing search optimization and experimentation<\/li>\n\n\n\n<li>15.05% YoY increase in search visits<\/li>\n\n\n\n<li>13.33% YoY increase in search visits with cart add<\/li>\n\n\n\n<li>4.0% YoY increase in internal searches<\/li>\n<\/ul>\n\n\n\n<p>The biggest change, though, was not any single metric. It was that we moved the search platform into a different mode of operation. We were no longer relying mostly on lexical tuning and one-off changes. We had a stronger foundation for semantic retrieval, ranking science, and continuous improvement.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Leadership and org capability impact<\/h2>\n\n\n\n<p>A big part of my role in this work was making sure the organization could support this kind of product and data science work over time.<\/p>\n\n\n\n<p>I was not only driving the roadmap. I was also building the operating model around it.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Strengthening how teams worked together<\/h3>\n\n\n\n<p>Search relevance work sits across product, backend engineering, frontend, analytics, data science, and merchandising. I spent a lot of time aligning those groups around shared goals and shared metrics so we could make better decisions faster.<\/p>\n\n\n\n<p>That cross-functional alignment mattered as much as the model work. 
Without it, the program would have turned into disconnected efforts.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Building analytics and decision infrastructure<\/h3>\n\n\n\n<p>The centralized Power BI reporting work helped create a shared language for performance. It reduced confusion, improved decision quality, and gave leadership a clearer view into search, personalization, monetization, and experimentation outcomes.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Building AI\/ML product capability in the org<\/h3>\n\n\n\n<p>In parallel with platform work, I also invested in organizational capability:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>I established and grew an AI\/ML Guild with 15+ sessions and workshops<\/li>\n\n\n\n<li>I partnered with external groups, including Elasticsearch and Microsoft, to support learning and development<\/li>\n\n\n\n<li>We grew participation to 220+ cross-org members by early 2024<\/li>\n\n\n\n<li>I built an AI\/ML Resource Center with practical materials on platforms, use cases, MLOps, infrastructure, team building, and responsible AI<\/li>\n<\/ul>\n\n\n\n<p>That work helped raise the baseline across the organization. It gave teams a better way to learn about, discuss, and execute AI\/ML product work beyond any one project.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What I would carry forward from this work<\/h2>\n\n\n\n<p>This project shaped how I lead search and AI product systems now.<\/p>\n\n\n\n<p>The core lesson for me was that relevance improvements do not come from one model decision. 
They come from treating search as a product system and improving the full loop:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>data quality and instrumentation<\/li>\n\n\n\n<li>retrieval quality<\/li>\n\n\n\n<li>ranking quality<\/li>\n\n\n\n<li>experimentation<\/li>\n\n\n\n<li>cross-functional decision-making<\/li>\n<\/ul>\n\n\n\n<p>That is what made the work sustainable, and that is what made the business impact durable.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Summary I led a search modernization program at Kroger that combined platform strategy, cross-functional execution, and relevance science. We moved from a primarily lexical search system to semantic retrieval and learning-to-rank, while also rebuilding the analytics and experimentation foundation needed to scale decision-making. The program improved conversion and search engagement, and it also changed how [&hellip;]<\/p>\n","protected":false},"featured_media":2108,"template":"","case-study-tag":[27,28,40,39,29],"class_list":["post-2095","case-study","type-case-study","status-publish","has-post-thumbnail","hentry","case-study-tag-ai","case-study-tag-search","case-study-tag-search-ranking","case-study-tag-search-relevance","case-study-tag-strategy"],"_links":{"self":[{"href":"https:\/\/www.katokane.com\/cm\/wp-json\/wp\/v2\/case-study\/2095","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.katokane.com\/cm\/wp-json\/wp\/v2\/case-study"}],"about":[{"href":"https:\/\/www.katokane.com\/cm\/wp-json\/wp\/v2\/types\/case-study"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.katokane.com\/cm\/wp-json\/wp\/v2\/media\/2108"}],"wp:attachment":[{"href":"https:\/\/www.katokane.com\/cm\/wp-json\/wp\/v2\/media?parent=2095"}],"wp:term":[{"taxonomy":"case-study-tag","embeddable":true,"href":"https:\/\/www.katokane.com\/cm\/wp-json\/wp\/v2\/case-study-tag?post=2095"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}