ZenML

AI-Generated Trip Reports for Outdoor Recreation Guides

Guidesly 2026

Guidesly, a vertical SaaS platform for outdoor recreation professionals, developed Jack AI to address the challenge of guides spending up to eight hours daily on marketing tasks like website updates, social media posting, and email campaigns. Built on AWS using serverless architecture, Jack AI automatically transforms raw trip data (photos, videos, metadata) into marketing-ready content across websites, social media, and email by combining computer vision for fish species detection, foundation models from Amazon Bedrock for content generation, and contextual prompting for tone alignment. The system reduced content generation time from 13 minutes to 2 minutes, increased content output from under 800 to over 2,500 assets by mid-2025, and helped the five most active guides grow average monthly revenue from approximately $3,000 to over $27,000 (a 9× increase) within six months through improved online visibility and consistent marketing presence.

Industry

Other

Overview

Guidesly is a vertical SaaS platform founded in 2019 that serves the outdoor recreation industry, connecting anglers, hunters, divers, and enthusiasts with professional guides, dive shops, and charters. The platform evolved into Guidesly Pro, a comprehensive business management solution handling bookings, payments, websites, client management, and marketing operations. To address the critical challenge of guides spending excessive time on marketing (up to eight hours daily on website updates, social media, and email campaigns), Guidesly developed Jack AI, an automated content generation system built entirely on AWS serverless infrastructure.

Jack AI represents a production-scale LLMOps implementation that automatically transforms raw trip data into polished, multi-channel marketing content. Unlike generic AI tools requiring constant prompting and oversight, Jack AI operates autonomously in the background, activating after each trip to process photos, videos, and metadata into ready-to-publish content across websites, social media platforms, and email campaigns. The system scales automatically to handle hundreds of guides while maintaining consistent quality and authentic voice alignment.

Business Problem and Context

The core challenge centered on marketing operations overhead for small outdoor guide businesses. Guides captured extensive trip media but struggled with several operational bottlenecks. Manually identifying and tagging species, sizes, fishing techniques, and locations in photos proved painstaking and time-consuming. Maintaining an authentic voice across generated content while scaling production remained nearly impossible—each guide has a unique storytelling style, shaped by local jargon and personal experience, that generic AI tools couldn't capture effectively.

SEO optimization presented another persistent challenge, as producing keyword-rich, locally optimized content consistently requires expertise and time that guides simply don’t have. Managing multiple publishing channels (trip report pages, blogs, Instagram, Facebook, email newsletters) demanded hours of writing, editing, and formatting weekly. Most critically, every hour spent on marketing tasks represented time not spent guiding clients—a direct tradeoff impacting both revenue and customer experience quality.

Architecture and Infrastructure

The Jack AI system leverages a fully serverless, event-driven architecture on AWS designed for automatic scaling and minimal operational overhead. The pipeline orchestration relies heavily on AWS Step Functions to manage complex multi-stage workflows, invoking AWS Lambda functions for each processing stage including data extraction, computer vision inference, media optimization, content generation, and multi-channel publishing.

Trip media ingestion begins when guides upload photos and videos through Amazon API Gateway, which immediately triggers the orchestration pipeline. This event-driven approach ensures fresh content delivery—posts and emails reach audiences while trips remain top-of-mind. All processed artifacts, including optimized media, extracted trip details, and generated marketing assets, are stored using a combination of Amazon S3 for durable, cost-effective media storage and Amazon RDS for PostgreSQL to maintain structured trip and guide data accessible to downstream workflows and reporting systems.
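As a rough sketch of this ingestion step, the handler behind API Gateway might map an upload event onto a Step Functions execution input along these lines (the field names and stage list are illustrative assumptions, not Guidesly's actual schema):

```python
import json

def build_pipeline_input(event: dict) -> dict:
    """Map an API Gateway upload event onto a Step Functions execution input.

    All field names here (guide_id, trip_id, media_keys) are illustrative;
    the real Guidesly payload schema is not public.
    """
    body = json.loads(event["body"])
    return {
        "guideId": body["guide_id"],
        "tripId": body["trip_id"],
        # S3 keys of the photos/videos the guide just uploaded
        "mediaKeys": body["media_keys"],
        "stages": ["extract_metadata", "detect_species",
                   "optimize_media", "generate_content", "publish"],
    }

# In a real Lambda handler, this payload would start the orchestration, e.g.:
#   sfn = boto3.client("stepfunctions")
#   sfn.start_execution(stateMachineArn=STATE_MACHINE_ARN,
#                       input=json.dumps(build_pipeline_input(event)))
```

Keeping the payload construction separate from the `start_execution` call makes the event mapping easy to unit-test without AWS credentials.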

Data Extraction and Contextual Enrichment

The pipeline begins with automatic metadata capture from uploaded media. As guides upload photos and videos, the system extracts embedded EXIF metadata including GPS coordinates, timestamps, and device settings without requiring manual data entry. This geospatial and temporal information is then enriched with relevant environmental context by combining location and time data with weather and water condition information for the same period—capturing details like tide levels, water temperature, wind speed, and cloud cover that would otherwise be lost.
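One small but representative piece of this extraction step is converting raw EXIF GPS tags into usable coordinates. EXIF stores latitude and longitude as degrees/minutes/seconds rationals plus a hemisphere letter; a minimal conversion helper looks like this:

```python
def gps_to_decimal(dms, ref):
    """Convert EXIF-style GPS coordinates into signed decimal degrees.

    `dms` is a sequence of three (numerator, denominator) pairs -- the raw
    form the EXIF GPSLatitude/GPSLongitude tags take -- and `ref` is the
    hemisphere letter from GPSLatitudeRef/GPSLongitudeRef.
    """
    degrees, minutes, seconds = (num / den for num, den in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative
    return -decimal if ref in ("S", "W") else decimal
```

For example, `gps_to_decimal([(27, 1), (56, 1), (3000, 100)], "N")` yields roughly 27.9417 degrees north, which can then be joined against weather and tide data for that position and timestamp.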

This contextual enrichment approach provides critical grounding for downstream content generation. By embedding actual environmental conditions alongside catch data, the system produces more personalized, authentic, and engaging narratives without additional guide effort. The automation ensures scalable consistency—whether processing a single image or hundreds, every media artifact receives high-quality contextual data, providing reliable inputs for subsequent AI processing stages.

Computer Vision Pipeline for Species Detection

Fish species identification represents one of the core technical challenges and showcases sophisticated computer vision LLMOps practices. The system must not only detect fish in real-world images but accurately classify hundreds of species across highly variable environments including boats, docks, lakes, and offshore locations. Guidesly designed a multilayer computer vision pipeline combining custom-trained models with foundation vision models available through AWS services.

The model development workflow runs primarily inside Amazon SageMaker AI using JupyterLab as the experimentation environment. This setup enables rapid prototyping of new computer vision architectures, large-scale training jobs on GPU-backed instances, evaluation across multiple fish classification benchmarks, and quick iteration between model improvements and production deployment. SageMaker AI serves as the central hub where datasets, training scripts, and model experiments are managed.

The dataset and training challenges are substantial—the system currently supports over 400 fish species classes collected from proprietary fishing report imagery, user-submitted catch photos, and curated partner datasets. Fish identification presents unique ML challenges due to large species counts and uneven data distribution. While popular species have thousands of training examples, many species have only limited labeled images, making traditional supervised learning difficult.

To address this data imbalance, Guidesly employs a hybrid training strategy combining standard supervised learning for species with large datasets and one-shot and few-shot learning techniques for rare species with limited training data. This approach allows the system to expand classification coverage without requiring extensive datasets for every species—a critical LLMOps consideration for maintaining broad functionality while managing training data acquisition costs.
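One common way to implement few-shot recognition for long-tail classes is nearest-prototype classification: average the handful of available embeddings per rare species into a prototype, then assign a query embedding to the most similar prototype. The sketch below assumes embeddings come from a trained backbone; it is an illustration of the general technique, not Guidesly's actual model:

```python
import math

def build_prototypes(embeddings_by_species):
    """Average the few labeled embeddings per rare species into one prototype."""
    protos = {}
    for species, vecs in embeddings_by_species.items():
        n = len(vecs)
        protos[species] = [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]
    return protos

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def classify_few_shot(query, prototypes):
    """Nearest-prototype classification for species with few training images."""
    return max(prototypes, key=lambda s: cosine(query, prototypes[s]))
```

Adding coverage for a new rare species then only requires embedding a few labeled photos and registering one more prototype—no retraining run.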

Rather than relying on a single model, the architecture implements a two-stage vision system separating object detection from species classification. The detection layer uses YOLO-based object detection models trained to identify relevant objects within fishing images including fish, fishing gear, people, boats, and environmental context. These detection models identify bounding boxes for each object, and critically, only the detected fish regions are cropped and passed to the next stage rather than processing entire images. This approach significantly improves classification accuracy by removing unrelated background elements that can confuse classification models.

Each cropped fish image is then passed to a specialized classification model. Over the development lifecycle, the team experimented with several architectures including convolutional neural networks (CNNs), ResNet-based models for strong baseline classification, and one-shot and few-shot models for long-tail species recognition. The combination of architectures balances accuracy, inference speed, and training efficiency across hundreds of species classes.
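The control flow of the two-stage design can be sketched in a few lines. Here the detector and classifier are stand-ins for the YOLO and ResNet-style models described above; the part that carries the pattern is that only detected fish regions are cropped and forwarded, while gear, people, and background are dropped:

```python
def crop_region(image, box):
    """Crop a bounding box (x1, y1, x2, y2) out of a row-major pixel grid."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]

def two_stage_identify(image, detector, classifier):
    """Stage 1: detect objects; stage 2: classify only the fish crops.

    `detector` returns (label, box) pairs; `classifier` maps a cropped
    region to a species name. Both are hypothetical stand-ins for the
    production models.
    """
    species = []
    for label, box in detector(image):
        if label != "fish":  # skip gear, people, boats, background clutter
            continue
        species.append(classifier(crop_region(image, box)))
    return species
```

Because the stages are decoupled, either model can be retrained or swapped independently—a practical benefit of the separation beyond the accuracy gain from background removal.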

Hybrid Vision and Foundation Model Integration

Beyond custom-trained models, Guidesly integrates multimodal foundation models available through Amazon Bedrock to provide additional reasoning and contextual understanding. However, the team recognized that raw vision models can hallucinate or misinterpret visual scenes. To mitigate this risk, several preprocessing and context injection steps occur before sending images to foundation models.

The workflow includes image preprocessing (cropping detected fish regions, normalizing image dimensions, removing unnecessary background), context enrichment (media metadata like location, water body, and time; known species distribution data; detection model outputs), and structured system prompts that provide the model with contextual information about the image while constraining possible species predictions. This hybrid approach combines the precision of domain-specific classifiers with the reasoning capabilities of large vision models, representing a sophisticated LLMOps pattern for balancing custom models with foundation model capabilities.
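A minimal version of the context-injection step might assemble the system prompt like this, constraining the foundation model to a closed set of candidate species for the location and season (all field names and wording are illustrative assumptions):

```python
def build_species_prompt(metadata, candidate_species, detector_guess):
    """Assemble a constrained verification prompt for a multimodal model.

    The key hallucination mitigation: the model may only answer with a
    species from `candidate_species` (those known to occur at this
    location and season) or 'unknown' -- never free-associate.
    """
    return (
        "You are verifying a fish species identification in the attached image.\n"
        f"Location: {metadata['water_body']}; month: {metadata['month']}.\n"
        f"A specialized classifier predicted: {detector_guess}.\n"
        "Answer with exactly one species from this list, or 'unknown':\n"
        + "\n".join(f"- {s}" for s in candidate_species)
    )
```

Passing the detector's own prediction into the prompt lets the foundation model act as a sanity check on the custom classifier rather than an unconstrained second opinion.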

After models are validated through the SageMaker AI experimentation workflow, they are deployed using managed endpoints on Amazon SageMaker AI. This enables real-time inference on uploaded images, automatic scaling for large media volumes, and continuous monitoring with the ability to push model updates. The result is a scalable vision system processing thousands of fishing images across Guidesly’s platform, delivering reliable fish species detection even in complex real-world conditions.

Media Optimization Pipeline

After fish detection and contextual enrichment complete, Jack AI focuses on preparing media for real-world publishing requirements. High-resolution photos and videos uploaded by guides are automatically processed into optimized, web-ready assets designed for websites, social networks, and email campaigns. This optimization pipeline handles compression, resizing, and format conversion behind the scenes, ensuring media files remain lightweight without sacrificing visual quality.

By standardizing assets early in the workflow, Jack AI eliminates manual image editing needs and maintains consistent presentation across devices and platforms. Optimized media is stored as versioned artifacts in Amazon S3 buckets and tagged for straightforward retrieval and reuse. These assets can be surfaced repeatedly across SEO pages, trip reports, newsletters, and social posts without reprocessing, keeping the publishing pipeline fast and efficient. Beyond performance benefits, this optimization step supports SEO goals—fast-loading images improve search rankings, enhance user experience, and reduce bounce rates on guide websites.
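The core of such an optimization step is computing web-ready target dimensions per channel while preserving aspect ratio and never upscaling. A minimal sketch (the per-channel size limits are illustrative, not Guidesly's actual presets):

```python
# Hypothetical per-channel size caps (width, height) in pixels
CHANNEL_LIMITS = {
    "website": (1600, 1600),
    "social": (1080, 1080),
    "email": (600, 800),
}

def fit_within(width, height, max_w, max_h):
    """Dimensions that fit inside max_w x max_h, preserving aspect ratio;
    images smaller than the cap are left alone (scale capped at 1.0)."""
    scale = min(max_w / width, max_h / height, 1.0)
    return round(width * scale), round(height * scale)

def render_targets(width, height):
    """Compute every channel's target size for one uploaded photo."""
    return {ch: fit_within(width, height, w, h)
            for ch, (w, h) in CHANNEL_LIMITS.items()}
```

Computing all channel variants once at upload time is what lets the same S3 artifact be reused across SEO pages, posts, and newsletters without reprocessing.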

Content Generation and Tone Alignment

To ensure generated trip reports feel authentic and aligned with the natural voice of fishing guides, Guidesly implemented a tone improvement layer within the content generation pipeline. Rather than fine-tuning underlying language models—which would introduce operational complexity, training costs, and maintenance overhead—the system optimizes tone through contextual inputs and structured prompting. This approach preserves distinctive storytelling styles guides use while maintaining scalability and operational simplicity.

The foundation of this approach is context injection. Structured trip metadata is embedded directly into model prompts, providing grounded context needed to generate accurate and relevant narratives. Alongside this, historical trip reports and guide-specific phrasing patterns are retrieved and included as reference examples, helping the model mirror vocabulary, pacing, and descriptive style that guides naturally use when documenting trips.

Carefully designed prompts guide foundation models from Amazon Bedrock toward outputs reflecting expected tone, allowing dynamic adjustments to writing style without operational overhead of maintaining fine-tuned models. This represents a pragmatic LLMOps decision—leveraging prompt engineering and retrieval-augmented generation (RAG) patterns rather than custom model training to achieve voice alignment at scale.

To avoid fabricated details and hallucinations, the generation process is constrained strictly to provided metadata and contextual inputs. The model is explicitly instructed not to infer missing information like additional species, techniques, weather, or locations absent from source data. Every report remains consistent with actual trip records—a critical requirement for maintaining trust and authenticity in client-facing marketing content.

Generation executes using Amazon Bedrock foundation models, which process contextual inputs and structured prompts to produce coherent, domain-appropriate reports at scale. By relying on contextual prompting rather than model re-training, the system avoids training infrastructure requirements, reduces operational overhead, and enables rapid iteration as new guide reports and domain patterns emerge. This approach achieves the necessary balance: authentic, guide-style trip reports delivered with reliability, cost efficiency, and scalability needed as the platform grows.
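Putting the pieces together, a request to a Claude model on Bedrock might be assembled as below—style examples injected as references, trip metadata as the only permitted facts, and an explicit no-inference instruction in the system prompt. The model ID and prompt wording are illustrative assumptions; the request body follows the Anthropic messages format Bedrock accepts for `invoke_model`:

```python
import json

def build_trip_report_request(trip_context, style_examples):
    """Assemble an Amazon Bedrock invoke_model request for trip-report
    generation. Only facts in `trip_context` may appear in the output."""
    system = (
        "Write a fishing trip report in this guide's voice. Use ONLY the "
        "facts provided. Do not invent species, techniques, weather, or "
        "locations that are not in the data."
    )
    user = (
        "Style examples from this guide's past reports:\n"
        + "\n---\n".join(style_examples)
        + "\n\nTrip data (JSON):\n" + json.dumps(trip_context)
    )
    return {
        "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",  # illustrative
        "body": json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "system": system,
            "messages": [{"role": "user", "content": user}],
        }),
    }

# The actual call would then be:
#   bedrock = boto3.client("bedrock-runtime")
#   resp = bedrock.invoke_model(**build_trip_report_request(ctx, examples))
```

Swapping guides means swapping the retrieved `style_examples`; swapping models means changing one string—the operational flexibility the no-fine-tuning decision buys.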

Publishing Pipeline and Orchestration

After content is generated, optimized, and refined for tone, the publishing pipeline orchestrates final delivery across multiple channels. This stage runs end-to-end with minimal manual effort while keeping guides informed through selective notifications. Asset generation is handled through an Assets Generation Step Function that orchestrates multiple AWS Lambda function executions. These functions generate marketing deliverables from artifacts stored in S3 for each trip, including SEO-friendly trip reports, fresh website content, social media posts, and personalized email campaigns.

The outputs are automatically stored in the system and integrated into downstream publishing workflows, eliminating manual drafting, copywriting, or formatting. After assets are ready, guides receive push notifications for review, allowing them to stay informed without unnecessary operational overhead. Publishing controls remain flexible through AI-driven automation—guides can review and approve generated content, request refinements, or enable an auto-publish toggle for full automation. This flexibility allows each guide to balance quality control with efficiency, remaining hands-on when needed or opting for a set-and-forget approach.

Behind the scenes, AWS Step Functions orchestrate multiple AWS Lambda function executions, scaling automatically to accommodate hundreds of guides with minimal infrastructure management. Amazon Simple Notification Service (SNS) handles notification delivery, ensuring guides receive timely updates about generated content without requiring polling or manual checking.
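The retry-and-sequence behavior Step Functions provides can be illustrated with a pure-Python stand-in—each stage is a callable, transient failures are retried with exponential backoff, and results flow forward to later stages (stage names and retry policy here are assumptions for illustration):

```python
import time

def run_pipeline(trip, stages, max_retries=2, backoff=0.0):
    """Run publishing stages in order, retrying transient failures.

    A toy analogue of Step Functions' Retry/Catch semantics: `stages`
    maps stage names to callables taking (trip, prior_results).
    """
    results = {}
    for name, stage in stages.items():
        for attempt in range(max_retries + 1):
            try:
                results[name] = stage(trip, results)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # exhausted retries; Step Functions would Catch here
                time.sleep(backoff * (2 ** attempt))  # exponential backoff
    return results
```

In the real system each stage is a Lambda function and the retry policy lives declaratively in the state machine definition, which is what makes the execution traces observable without instrumenting the code.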

Cost Management and Unit Economics

While the architecture scales automatically, cost per generated trip report remains relatively predictable. In typical scenarios, generating a full report—including media processing, computer vision inference, and content generation—costs approximately $0.10 to $0.50 per report. Final costs vary depending on the number of images processed, presence of video media, and volume of AI inference requests. Because the workflow is serverless and event-driven, guides only incur costs when reports are actually generated, keeping unit economics predictable as usage scales.
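A back-of-the-envelope cost model makes the stated range concrete. The $0.10–$0.50 bound is from the case study; the per-component figures below are purely illustrative assumptions:

```python
def estimate_report_cost(n_images, has_video,
                         per_image_vision=0.01, video_processing=0.15,
                         content_generation=0.05, base_overhead=0.02):
    """Rough per-report cost in USD: fixed serverless overhead, one LLM
    generation pass, per-image vision inference, optional video work.
    All component prices are illustrative, not actual Guidesly figures."""
    cost = base_overhead + content_generation + n_images * per_image_vision
    if has_video:
        cost += video_processing
    return round(cost, 2)
```

Under these assumed prices, a five-photo trip lands near $0.12 and a twenty-photo trip with video near $0.42—both inside the reported range—and, crucially, the cost is zero for guides who generate no reports.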

This cost structure represents thoughtful LLMOps economic management—leveraging pay-per-use serverless infrastructure and foundation model APIs rather than maintaining dedicated compute infrastructure or custom-trained large models. The result is a system that scales economically both upward and downward based on actual usage.

Production Results and Impact

Since launching Jack AI on AWS, Guidesly has observed rapid adoption and measurable impact across its community of outdoor guides. The system fundamentally transformed one of the most time-consuming aspects of guide operations—marketing and content creation. Previously, guides often spent more than six hours weekly writing trip reports, formatting blog posts, creating social media captions, and attempting content optimization for search. With Jack AI, this work is largely automated—trip photos and short notes uploaded by guides convert into complete marketing-ready asset sets including trip reports, SEO-rich website content, social captions, and email updates, produced in minutes rather than hours.

Jack AI adoption climbed steadily from just over 100 reports in early 2025 to nearly 340 reports by July 2025, reflecting a broader industry shift in which guides who previously hesitated to embrace technology and digital marketing now rely on Jack AI to build and grow their online presence. Content output scaled dramatically from under 800 assets in early 2025 to more than 2,500 by midsummer 2025. Each trip report produces multiple deliverables—SEO artifacts for guide websites, captions for Instagram and Facebook, narrative descriptions tailored to email marketing—allowing guides to maintain an authentic, consistent presence across channels.

Content delivery speed improved substantially as the system matured. Asset generation time dropped from 13 minutes in December 2024 to just 2 minutes by August 2025. This speed ensures trip reports, social media posts, and email campaigns are ready almost immediately after trip conclusion, allowing guides to reach clients and networks while experiences remain fresh. This temporal proximity drives higher engagement across social channels and faster responses to post-trip emails. Automated emails showcasing recent trips reach past and current customers within hours, helping convert positive client energy into repeat bookings and word-of-mouth marketing.

The financial impact has been substantial. Among the five most active guides using Jack AI, average monthly revenue grew from approximately $3,000 in January 2025 to more than $27,000 by July 2025—a nearly 9× increase within six months. Guides credited this growth to their ability to maintain steady content flow that boosted visibility in search engines, drove engagement on social networks, and ultimately converted into new bookings.

LLMOps Patterns and Lessons

This case study demonstrates several important LLMOps patterns for production systems. The hybrid approach combining custom-trained computer vision models with foundation models through Amazon Bedrock illustrates how to balance domain-specific accuracy with broad reasoning capabilities. Rather than forcing one approach, the architecture strategically applies each where it provides greatest value—custom models for precise species classification, foundation models for contextual understanding and content generation.

The decision to use contextual prompting and retrieval-augmented generation for tone alignment rather than fine-tuning foundation models represents pragmatic operational thinking. While fine-tuning could potentially achieve similar voice alignment, it introduces training infrastructure requirements, model versioning complexity, ongoing maintenance overhead, and potentially higher inference costs. By achieving comparable results through prompt engineering and context injection, Guidesly maintained operational simplicity while preserving flexibility to iterate quickly as guide styles and content requirements evolve.

The serverless, event-driven architecture pattern demonstrates how to build scalable LLM systems without managing infrastructure. By composing AWS Step Functions, Lambda, S3, RDS, SageMaker AI, and Bedrock, the system automatically scales from processing individual trip reports to handling hundreds concurrently without manual capacity planning or infrastructure management. This approach keeps operational overhead low while supporting rapid growth—critical for a small team building vertical SaaS products.

The workflow orchestration through Step Functions illustrates how to manage complex, multi-stage AI pipelines reliably in production. Each stage (metadata extraction, computer vision inference, media optimization, content generation, publishing) operates as discrete Lambda functions coordinated by Step Functions, providing clear separation of concerns, independent scaling characteristics, retry logic, and observable execution traces for debugging and monitoring.

The cost management approach—tracking per-report costs ranging from $0.10 to $0.50—shows how serverless and foundation model APIs enable predictable unit economics. Rather than amortizing fixed infrastructure costs across usage, the pay-per-use model aligns costs directly with value delivery, making the economics transparent and scalable.

Finally, the system demonstrates thoughtful handling of AI reliability concerns, particularly around hallucination prevention. By constraining content generation strictly to provided metadata and contextual inputs, explicitly instructing models not to infer missing information, and combining custom detection models with foundation model reasoning, the architecture maintains factual accuracy necessary for client-facing marketing content where trust is paramount.
