Changelog

October 2025

OCT 29 NEW - Added pagination controls for integration logs

Integration logs now display pagination controls when there are multiple pages of results. Users can navigate between pages using Previous/Next buttons and see their current position (e.g., “Page 1 of 5”). A results counter shows the range of logs being displayed (e.g., “Showing 1-20 of 87 results”). Pagination controls only appear when there is more than one page of logs available.

OCT 29 NEW - Custom Date Range Filter, Search, and CSV Export for Integration Logs

Added comprehensive filtering and export capabilities for integration logs. Users can now search logs by prompt/response content with debounced input, filter by custom date ranges (24 hours, 7 days, 30 days, or 90 days), adjust page size (10, 25, 50, or 100 items), and export filtered logs to CSV format. The CSV export includes timestamp, model, prompt, response, token counts, costs, latency, and status code for comprehensive log analysis.

OCT 25 NEW - Export insights reports to CSV and improved chart tooltips with hover tracking

Added ability to export comprehensive insights reports to CSV format, including summary statistics, daily/weekly/monthly active users, request timelines, top users and spaces, model distribution, and tool usage data. Enhanced chart visualization with improved tooltips that follow mouse position and added informational tooltips explaining DAU (Daily Active Users), WAU (Weekly Active Users), and MAU (Monthly Active Users) metrics. Charts now include proper padding to prevent tooltip cutoff at edges.

OCT 24 FIX - Fixed image attachment support for Grok-4 and OpenAI models

Fixed an issue where attached images were not properly sent to Grok-4, Grok-4-fast, Grok-4-fast-reasoning, and Grok-4-fast-reasoning models. The fix enables vision support for these Grok models and ensures image content is correctly serialized by excluding empty fields when sending requests to vision-capable models. Images can now be successfully processed by these models according to xAI’s vision capabilities documentation.

OCT 24 FIX - Fixed timeout and streaming errors when processing large files with context chunking

Resolved issues that prevented processing of large files by automatically applying context chunking as a fallback when no models fit the context window. The engine now disables smart_learn during chunking operations, extracts and combines large content from all messages (over 1000 characters), and creates semantic chunks to reduce token count. This prevents “No models fit” errors and enables successful processing of files that previously exceeded model context limits.

OCT 23 FIX - Fixed context window overflow errors when processing large documents

Fixed automatic document chunking failing when content exceeded 100,000 characters by disabling SmartLearn for large documents and preventing recursive chunking loops. The system now properly handles documents that exceed 30% of a model’s context window (threshold for models with >2000 token limits) by automatically splitting content into manageable chunks without triggering embedding model token limit errors or infinite chunking recursion.

OCT 14 IMPROVEMENT - Optimized Insights query performance to prevent memory issues

Improved the performance and reliability of the Insights page by optimizing database queries to filter data earlier in the process. The queries now filter requests by date range at the database level and only fetch records with relevant fields (tool_calls and model data), significantly reducing memory usage and preventing out-of-memory errors. Additionally, increased API server memory limit from 3GB to 6GB to handle larger datasets more reliably.

OCT 14 IMPROVEMENT - Improved Insights page performance and chart interactions

Enhanced the organization Insights page to prevent unnecessary API calls by only triggering data refetch when custom date filters are actually applied, not on every keystroke. Added interactive chart tooltips that appear on hover showing precise data points with formatted dates and values. Improved chart visual design with gradient fills, smoother lines, and better hover states for data points.

OCT 14 NEW - Added Insights Beta for organization analytics and usage tracking

Introduced a comprehensive Insights dashboard for organization admins to analyze usage patterns with default 30-day views and period comparison support. The insights include user activity metrics (DAU/WAU/MAU), monthly breakdowns, model distribution, tool usage statistics, router usage data, top apps, and agent vs user request tracking. Users can now specify custom date ranges and compare different time periods to track organizational trends.

OCT 13 IMPROVEMENT - Increased Suite X subscription daily API request limit from 20 to 400

Suite X subscription tier now includes 400 daily API requests, a significant increase from the previous limit of 20 requests per day. This 20x improvement provides Suite X subscribers with substantially more capacity for generation API endpoints, enabling more extensive usage of AI features throughout the day.

OCT 13 FIX - Fixed function calls and tool calls in streaming mode

Fixed streaming responses to properly support function/tool calls across all providers including OpenAI, Anthropic, xAI, and Google. The system now automatically enables tool support when tools are provided in requests, correctly converts tool formats between provider specifications (OpenAI to Anthropic/Google formats), and properly handles tool_choice parameters for each provider. This ensures external agents can receive and execute tool calls in streaming mode with full OpenAI compatibility.

OCT 09 NEW - Added automatic context chunking for large documents exceeding model limits

Introduced a new context chunking tool that automatically splits large content into manageable chunks when it exceeds 95% of a model’s context window. When triggered, content is divided into chunks of up to 100,000 tokens (with 1,000 token overlap, max 100 chunks), and the LLM is automatically instructed to process each chunk sequentially using the context_chunking tool, then synthesize the results. This enables analysis of documents that would otherwise exceed context limits without manual intervention.

OCT 09 FIX - Fixed missing logo for baseline evaluations in custom router dialog

Fixed an issue where baseline evaluation runs in the custom router creation dialog were not displaying the correct Pulze logo. Baseline evaluations now properly show the Pulze favicon icon instead of no icon, making it easier to visually identify baseline runs in the evaluation list.

OCT 09 IMPROVEMENT - Unified model and router selection with improved overflow handling

Replaced the separate “Dynamic Router” option with a unified model/router selector in the assistant creation dialog and tool configuration. The selector now combines both models and routers in a single dropdown menu. Additionally, improved UI rendering by adding proper overflow handling and flex layout to prevent text truncation issues in long model/router names.

OCT 09 FIX - Fixed 422 errors and incomplete chunk processing in chat workflow

Resolved two critical issues in chat processing: (1) Fixed 422 errors caused by context window limits by adjusting the automatic chunking threshold from 95% to 30% of context window to account for tokenizer differences across providers (Anthropic counts ~63% more tokens) and content expansion after plugins run (up to 3x expansion observed). (2) Fixed incomplete chunk processing where AI would stop after the first chunk by adding explicit instructions to continue processing all remaining chunks, ensuring complete analysis of large documents split across multiple chunks.

OCT 07 IMPROVEMENT - Improved baseline benchmark model display with accurate creation dates

Enhanced the baseline benchmark display to show accurate creation dates for each evaluated model instead of placeholder dates. The system now reads metadata from benchmark.json files to properly display when each baseline model was evaluated, providing better temporal context when comparing model performance. Updated the baseline benchmark ID to use the latest benchmark dataset.

OCT 07 IMPROVEMENT - Added default settings display (max_tokens & temperature) to Space overview

Enhanced the Space overview interface to display default model settings including max_tokens and temperature parameters. Users can now see the default generation settings configured for their Spaces at a glance.

OCT 03 FIX - Fixed evaluation runs getting stuck when pausing/resuming

Fixed a critical issue where paused evaluation runs could become stuck due to stale resume requests. When pausing an evaluation run, the system now properly clears the resume_requested_at field to prevent the run from automatically resuming unintentionally. Additionally, increased the evaluation worker capacity from 1 process/1 thread to 4 processes/8 threads for better performance, and added automated cleanup tasks to detect and remove duplicate evaluation results while maintaining data consistency.

OCT 02 FIX - Fixed pause/resume for multi-model evaluations with non-running child runs

Fixed an issue where pausing multi-model evaluation runs only paused child runs in RUNNING state, leaving PENDING or FAILED runs unaffected. Now all non-completed child runs (RUNNING, PENDING, FAILED, etc.) are properly paused when pausing a parent evaluation, ensuring consistent state when resuming. Completed child runs are correctly preserved and skipped during pause operations.

OCT 01 FIX - Fixed child evaluation run display overflow in evaluation runs view

Fixed a layout issue where child evaluation runs could overflow their container by replacing ‘flex-shrink-0’ with ‘min-w-0 max-w-full’. This ensures that child run cards (displaying router or model information) properly respect container boundaries and wrap text instead of causing horizontal overflow, improving readability in the evaluation runs interface.

September 2025

SEP 30 MODEL - Added Anthropic Claude Sonnet 4.5 models with 200K context window

Added two new Anthropic models: claude-sonnet-4.5 (latest alias) and claude-sonnet-4.5-20250929 (fixed version). Both models feature 200K context window, support for vision/image input, function calling, JSON output, and streaming. Models have exceptional agent and coding capabilities with prompt costs of

0.003/1K tokens and completion costs of

0.015/1K tokens. Fixed model failover chain behavior to properly clean up model settings when failover is disabled.

SEP 30 FIX - Improved error handling and fixed router/model selection behavior

Enhanced error display with better handling of network interruptions and JSON parsing errors, now showing user-friendly refresh options. Fixed model router selection logic to properly handle fallback scenarios and maintain selection state when switching between routers and models. Also corrected evaluation page navigation and improved stream connection reliability.

SEP 28 MODEL - Major model deprecation schedule update for Q4 2025

Extensive model deprecation schedule affecting multiple providers: Claude 3.5 Sonnet (Oct 22, 2025), Claude 3 Opus (Jan 5, 2026), Gemini 1.5 series (Sept 22, 2025), Google Gemini 2.0 Flash models (Feb 5, 2026), and various models from Together, Groq, Cohere, and Fireworks. Additionally, several models from Anthropic, Fireworks, Mistral, and OpenAI will no longer be pre-selected by default in new spaces.

SEP 28 NEW - Added bulk sampling and subject selection tools to benchmark browser

Enhanced the benchmark browser with new bulk selection capabilities, including the ability to select all subjects at once, sample specific subjects, and perform bulk subject sampling with customizable sample sizes and random seeds. Users can now efficiently select multiple benchmark items by subject, with detailed feedback on sample distribution and warnings for subjects with insufficient data. Added support for xAI provider with new logo integration.

SEP 18 NEW - Added dataset management and evaluation framework (Pulze V1.0)

Introduced a comprehensive evaluation and dataset management system allowing users to create, manage, and run evaluations on their AI models. Users can now create datasets with custom prompts, expected answers, and system instructions, then use evaluation templates with configurable metrics and rater models to assess model performance. The system supports both manual and benchmark datasets, with detailed progress tracking and scoring capabilities.

SEP 18 IMPROVEMENT - Enhanced Avatar component with fallback initials and system user handling

Improved the Avatar component with intelligent fallback handling: displays user initials when profile pictures fail to load, and shows the Pulze favicon for system users. Added better error handling for image loading, smart initials generation from names (using first and last initials), and special styling for system user avatars with dedicated padding and rounded borders.

August 2025

AUG 27 MODEL - Add GPT-5 and Claude Opus 4.1, expand Claude Sonnet 4.0 context window

Claude Sonnet 4.0’s context window has been expanded to 1M tokens, with updated pricing for long contexts (>200K tokens):

6/MTok for prompts and

22.50/MTok for completions. Added new Claude Opus 4.1 model with 200K context window (

15/MTok prompt,

75/MTok completion) featuring vision support, function calling, and streaming capabilities. GPT-5 models were also added (details truncated in diff) with support for functions, JSON, vision, and streaming.

AUG 27 IMPROVEMENT - Enhanced PDF processing reliability and renamed refresh endpoint to reprocess

Improved the reliability of custom data processing by adding a new reprocess endpoint for both app and organization-level data files. The ‘/refresh’ endpoint has been renamed to ‘/reprocess’ for better clarity, and direct database updates have been implemented for more reliable state management. The update includes enhanced handling of synced files and better error messaging for unprocessable files.

AUG 27 IMPROVEMENT - Improved PDF processing reliability with new 'Process Again' option

Added a new ‘Process Again’ option to reprocess PDFs and web pages that may have failed initial processing. This feature is available both at the organization level and within individual apps through a new action menu. The UI has been updated to show clearer feedback messages during reprocessing attempts, and the ‘Refresh’ button has been renamed to ‘Retry’ for clarity.

June 2025

JUN 25 MODEL - Switched conversation naming model to Groq's LLaMA-3-70B

Changed the model used for generating conversation names from OpenAI’s GPT-4-Nano to Groq’s LLaMA-3-70B-Instruct. This change aims to improve the quality and speed of auto-generated conversation titles.

JUN 25 IMPROVEMENT - UI Enhancements: Improved Navigation & Message Source Display

JUN 20 MODEL - Added O3-Pro model with 200K context window and advanced capabilities

Added OpenAI’s O3-Pro model and its snapshot version (o3-pro-2025-06-10) with a 200K token context window. The model supports streaming, vision, JSON output, function calling, and multiple completions. It features higher compute capabilities for complex problems, with token costs of 0.015¢ for prompt and 0.06¢ for completion tokens.

JUN 20 IMPROVEMENT - Enhanced document source handling in chat interface

Improved source document handling in chat interface with new document type detection and direct document preview functionality. Users can now click on document sources to open them in a new tab, while non-document sources show detailed information in a popup. Added visual indicators to distinguish between document and non-document sources, with a new document icon for PDF/document sources.

JUN 11 IMPROVEMENT - Updated default tool configurations and extended free trial period

Added support for several new AI tools including Gmail read, Slack read, LinkedIn profile access, transcribe audio, and multi-turn image editing capabilities. DALL-E image generation is now disabled by default. The free trial period for new subscriptions has been extended from 3 to 7 days. Default configurations now include more granular human-in-the-loop settings for various tools.

JUN 11 IMPROVEMENT - Enhanced Pro Assistant capabilities and extended free trial period

Extended the free trial period from 3 to 7 days for Pro plan upgrades. Added McpTool capability to multiple Pro Assistants including Content Writer, Business Analyst, and Research Assistant. Enhanced configurable options for Pro Assistants by adding a new configuration button and MultiTurnImageEditing tool to select assistants like Project Manager and Wellness Coach.

May 2025

MAY 31 FIX - Fixed null reference error in member management dialog

Fixed crashes in the Add Member dialog when handling partially initialized user data. The dialog now properly validates user objects before filtering, displaying, and processing member additions, ensuring stable operation when managing space members with incomplete profile information.

MAY 29 IMPROVEMENT - Enhanced multi-turn chat with artifact tracking and file history support

Added support for maintaining context across chat conversations by tracking generated artifacts (images, transcripts, documents) and attached files throughout the conversation history. The system now automatically scans previous messages for pulze:// URLs and file references, preserves them in metadata, and makes them available to plugins in subsequent turns. This enables more coherent multi-turn interactions where AI can reference and work with previously generated content or uploaded files.

MAY 29 NEW - Added Multi-turn Image Editing tool with advanced configuration options

Added a new Multi-turn Image Editing tool that supports interactive image generation and editing. Users can configure the tool with multiple GPT models (including gpt-4o, gpt-4.1, o3) and customize image parameters like size (up to 1536x1024), quality levels (high/medium/low), and background types (transparent/opaque). The tool includes automatic parameter selection and supports various image dimensions with flexible quality settings.

MAY 28 NEW - Add MCP (Model Context Protocol) Tool Integration Support

Added new endpoints for configuring and managing MCP tool integrations. Users with editor permissions can now configure MCP server credentials and disconnect MCP tool connections through the API. The integration is managed through the linked accounts system, allowing organizations to maintain separate MCP tool configurations per user.

MAY 28 FIX - Layout improvements and updated model banner

Fixed layout issues in the MCP server configuration dialog by adding proper width constraints and flex behavior to prevent text overflow. Updated the promotional banner to reference Claude 4 instead of Claude 3.7, alongside GPT-4.1 and DeepSeek-R1 models.

MAY 27 FIX - Improved model scoring reliability and plugin initialization

Enhanced error handling for the model scoring system to gracefully handle API failures and connection issues. Now falls back to default model scores (featuring Claude Sonnet 4.0) when the scoring service is unavailable. Also fixed initialization of query plugins including file/URL handling and RAG query rewrite functionality, with improved logging for better troubleshooting.

MAY 27 MODEL - Added Claude 4 Sonnet and Opus models with 200K context windows

Added four new Claude models: claude-sonnet-4-0, claude-sonnet-4-20250514, claude-opus-4-0, and claude-opus-4-20250514. All models feature 200K token context windows, support for functions, streaming, and vision capabilities (except dated versions). Pricing is set at

0.003/1K tokens for input and

0.015/1K tokens for output. Claude Opus 4 is specifically optimized for coding tasks and complex, long-running workflows.

MAY 27 FIX - Fix Advanced Workflow tool configuration preservation

Fixed an issue where Advanced Workflow tool configurations were being automatically overwritten when loaded from the database. The system now properly preserves existing tool configurations, maintains the correct recipe order from the latest version, and automatically handles legacy configurations by converting them to the advanced workflow format when multiple tools are detected.

MAY 27 IMPROVEMENT - Improved file search functionality and simplified plugin credential handling

Updated the file search input schema to clarify file reference formatting requirements, making it easier for users to understand the correct syntax. File references now use a simpler format ‘<!file:UUID:filename>’ or ‘<!url:UUID:url>’ without requiring quotes or JSON formatting. Also streamlined the plugin credential handling system by removing redundant placeholder code.

MAY 27 NEW - Free users are now automatically redirected to onboarding page

Added automatic redirection of free-tier users to the /onboarding page whenever they attempt to access other sections of the application. This new feature ensures free users complete the onboarding process before accessing the main application features. The redirect persists until users upgrade from the FREE subscription tier.

MAY 23 NEW - Added MCP (Model Context Protocol) tools integration support

Added support for integrating with external MCP servers to discover and use their tools within the agent system. This allows connecting to multiple MCP servers simultaneously, automatically discovering their available tools, and using them through a standardized interface. Each MCP server can provide multiple specialized tools with defined input schemas and capabilities that can now be used alongside existing tools.

MAY 21 FIX - Fix horizontal overflow in tool call input display

Fixed an issue where tool call input details could overflow beyond the container width. Added horizontal scrolling to tool call input displays, ensuring long JSON content remains accessible without breaking the layout.

MAY 20 FIX - Fixed credential injection for Slack, Gmail, and LinkedIn tools in default assistants

Fixed an issue where user credentials were not being properly injected into Slack, Gmail, and LinkedIn tools when using default Pulze assistants. Previously, these integrations would fail to authenticate when not using a custom assistant. The fix ensures that user credentials are now correctly loaded and injected for all assistant types, improving reliability of third-party tool integrations.

MAY 14 TECHNICAL - Remove debug tool configuration display from Edit Tools dialog

Removed visible JSON debug output that was showing tool configuration details in the Edit Tools dialog. This improves the UI cleanliness by removing technical information that was not meant for end users.

MAY 08 FIX - Fix user access permissions for unverified email accounts

Removed email verification requirement for accessing dashboard data, application lists, and logs. Users can now access these features immediately after account creation, without waiting for email verification. Additionally, added new Stripe price mappings for PulzeOne and PulzeXSuite subscription tiers with monthly, quarterly, and yearly billing cycles.

MAY 07 NEW - Add PulzeOne and PulzeXSuite subscription tier enforcement rules

Implemented specific subscription tier limits for PulzeOne and PulzeXSuite users. PulzeOne tier is limited to 20 datasources per organization with no special API requests allowed. PulzeXSuite tier allows up to 50 datasources and 20 special API requests per day. Both tiers include seat limit enforcement for organization members.

MAY 07 NEW - Added pro assistant avatar 'Buzz' with animated expressions

Added new professional assistant avatar ‘Buzz’ with detailed SVG graphics including animated facial expressions, eye movements, and emotional responses. The avatar features a distinctive black and white color scheme with interactive elements like blinking eyes, dynamic mouth movements, and responsive emotional states.

MAY 05 NEW - Add linked accounts system with LinkedIn integration

Added support for linking external service accounts to organization members, starting with LinkedIn integration through the partner API. Users can now connect and disconnect their LinkedIn accounts through a new linked accounts management system, with each account type uniquely constrained per organization member. The implementation includes a new database schema for linked accounts and OAuth-based authentication flow.

April 2025

APR 17 FIX - Fixed assistant avatar image rendering and sizing

Fixed the rendering of custom assistant avatar images by properly applying size classes and image formatting. Custom avatars now correctly display in 5 different sizes (sm: 24px, md: 32px, lg: 40px, xl: 56px, 2xl: 72px) with consistent rounded corners and proper scaling.

APR 16 NEW - New Pro Assistants & Subscription-Based Onboarding Flow

Introduced a new onboarding flow with pro-level assistants and subscription-based access control. Users can now filter assistants by pro-only status, and assistant availability is automatically managed based on subscription tier (Free, PulzeOne, or higher). Organizations with admin privileges can enable/disable assistants globally, with automatic management for Free/PulzeOne subscriptions through a new checkout process.

APR 15 FIX - Fixed overflowing table displays in model responses

Fixed an issue where tables in model responses could overflow beyond the visible area on smaller screens. Tables now automatically scroll horizontally when their content exceeds the container width, ensuring all data remains accessible while maintaining the visual layout.

APR 14 MODEL - Add OpenAI GPT-4.1 family of models with 1M context window

Added six new OpenAI GPT-4.1 models with 1 million token context windows: gpt-4.1, gpt-4.1-mini, and gpt-4.1-nano (plus their dated variants). All models support functions, streaming, vision, JSON output, and penalties. The models offer different price points, with gpt-4.1-nano being the most cost-effective at 0.0001¢/prompt token and 0.0004¢/completion token, while gpt-4.1 costs 0.0002¢/prompt token and 0.0008¢/completion token.

APR 14 NEW - Added support for Excel (XLSX/XLS) and PowerPoint (PPTX/PPT) file formats

Added support for uploading and handling Microsoft Office spreadsheet (XLSX, XLS) and presentation (PPTX, PPT) file formats. Users can now work with these additional file types alongside existing document formats like PDF, DOC, DOCX, and TXT.

APR 04 IMPROVEMENT - Enhanced plugin system with selective plugin loading and improved validation

Added support for selectively loading specific plugins through request payload, allowing more precise control over which tools are available during interactions. The system now better handles models with function-calling capabilities by automatically configuring appropriate tools. Also fixed validation issues with document IDs in custom data handling by making them optional.

APR 01 IMPROVEMENT - New data sync infrastructure with enhanced connection management

Introduced a new data synchronization system that replaces the previous RAG integration with a more flexible connection framework. Added support for tracking connection states through new fields (connection_id, document_id, source_type, last_synced_on) and reorganized API endpoints to use ‘/connect’ and ‘/webhook’ instead of ‘/carbon’. This change provides a more robust foundation for managing external data connections and synchronization.

APR 01 NEW - Added Crisp customer support chat and Spaces analytics integration

Added Crisp live chat support widget to help users get real-time assistance. Also integrated TikTok analytics pixel tracking for Spaces pages to better understand user engagement and behavior patterns.

APR 01 NEW - Added default assistant selection in space home view

Users can now set and manage a default assistant for their workspace directly from the space home page. The interface displays the currently selected default assistant with options to remove it, or create a new assistant if none is set. Default assistants will be automatically selected when starting new conversations in the space.

APR 01 FIX - Fix tool status switch behavior when viewing non-existent tools

Fixed an issue where the status toggle switch in tool details would render incorrectly when accessing tools that don’t exist. The fix adds proper key handling to the switch component, ensuring consistent rendering and state management for tool status toggles.

March 2025

MAR 28 IMPROVEMENT - Added LinkedIn Profile plugin support and improved plugin configuration management

Added support for the LinkedIn Profile plugin and improved how plugin configurations are handled. Plugin configurations can now be managed globally through organization settings instead of individual assistant configurations. This change specifically affects the LinkedIn Profile plugin, which now reads connected account information from global organization configurations.

MAR 28 FIX - Fix partner webhook URL and improved LinkedIn Profile tool configuration

Fixed an issue with partner webhook URL formatting and enhanced LinkedIn Profile tool management by centralizing its configuration at the organization level. Organization administrators can now manage LinkedIn Profile tool settings globally, which will automatically apply to all assistants using this tool. When assistants use the LinkedIn Profile tool, they will inherit organization-level configurations while maintaining the ability to have assistant-specific overrides.

MAR 27 NEW - Added external service authentication integration with partner

Added support for connecting multiple external services (LinkedIn, WhatsApp, Instagram, Messenger, Telegram, Google, Microsoft, IMAP, X) through partner authentication flow. Users can now generate hosted authentication links to connect their accounts with automatic service type mapping and one-hour expiration. The integration includes success/failure redirects and webhook handling for connection status.

MAR 25 IMPROVEMENT - Enhanced Web Search with timeout controls and new Exa Search integration

Improved web search functionality with a configurable 120-second timeout to prevent hanging searches, plus better error handling and connection management. Added support for Exa API integration for advanced search capabilities, with configurable maximum results (default: 20) and custom timeout messages. Search service now includes improved connection health monitoring and handles large response messages up to 100MB.

MAR 21 IMPROVEMENT - Enhanced Gmail label handling and email data structure in Gmail Read plugin

Improved Gmail label processing to support comma-separated label values and enhanced email parsing with more comprehensive recipient information. The plugin now returns structured email data including sender, recipient, thread ID, and truncated message body (limited to 500 characters), making it more robust for handling multiple labels and email metadata.

MAR 19 FIX - Fixed Gmail OAuth redirect URI for draft poster tool

Fixed incorrect OAuth redirect URI handling for the Gmail draft poster tool. The system now correctly constructs the OAuth redirect URL using the appropriate environment-specific API URL (localhost, development, or production) combined with the OAuth callback path. This resolves authentication issues when using the Gmail draft posting feature.

MAR 19 NEW - Added Gmail draft creation integration with OAuth authentication

Added new Gmail integration allowing users to create email drafts directly through the API. Includes secure OAuth2.0 authentication flow for connecting Gmail accounts with automatic token refresh handling. Users can now authorize access to their Gmail account through a popup window, with the integration storing credentials securely for future use.

MAR 19 NEW - Added automated task scheduling system with flexible configuration options

Introduced a new task scheduling system that allows users to create and manage automated tasks with custom schedules. Tasks can be associated with specific apps or organizations, support plugin integrations, file attachments, and different LLM models. Users can configure task names, prompts, schedule types, and view task status, last run time, and next scheduled run through both app-specific and organization-wide interfaces.

MAR 17 IMPROVEMENT - Improved conversation naming with GPT-4o-mini model

Enhanced conversation naming by using the dedicated GPT-4o-mini model, which is now hardcoded for generating conversation titles. This change ensures more consistent and higher-quality conversation names when creating new chats. The model is configured with a 60-token limit to generate concise, relevant titles.

MAR 17 IMPROVEMENT - Enhanced mobile responsiveness for Space home page with collapsible settings panel

Added mobile-friendly layout for Space home page with a collapsible settings panel that automatically hides on mobile devices (screen width < 768px). Users can now toggle space settings via a floating button, and the assistant search preview has improved text truncation for better mobile display. The settings panel includes default model selection and member management in a more accessible format for small screens.

MAR 16 FIX - Fixed initial model selection in Space Home view

Fixed an issue where the default model selection wasn’t properly handling failover models. The system now correctly initializes the default model by first checking available failover models, then falling back to regular models, and finally defaulting to SMART_MODEL if no others are available.

MAR 15 IMPROVEMENT - Improved UX for accessing disabled assistants with admin permissions

Enhanced the handling of disabled assistants by adding direct navigation to permissions page when clicking disabled assistants. Disabled assistants now show a clearer visual state with 50% opacity and a light red background, plus an improved message ’🔒 Assistant is disabled. Admins click here to enable it.’ This provides administrators a more intuitive way to enable assistants directly from the grid or search views.

MAR 14 IMPROVEMENT - Improved assistant creation handling for non-admin users

Enhanced the assistant creation process to gracefully handle permissions when users aren’t organization administrators. The system now properly validates user access permissions before attempting to add assistants to organization configurations, preventing potential errors for non-admin users.

MAR 14 IMPROVEMENT - Auto-enable assistants when created by organization administrators

Organization administrators can now create assistants that are automatically enabled for their entire organization. When an org admin creates a new assistant, it is immediately added to the organization’s configuration with enabled status, eliminating the need for manual activation. This streamlines the assistant deployment workflow for organization administrators.

MAR 14 FIX - Fix 'Try Now' button for assistants when duplication is disabled

Fixed an issue where assistants were incorrectly enabled by default in the global configuration. Now, assistants are disabled by default unless explicitly enabled in the organization’s configuration, with exceptions for draft assistants and those owned by the current application. This ensures the ‘Try Now’ button works correctly based on proper permission settings.

MAR 03 FIX - Fix model selection behavior for assistants using dynamic router

Fixed model selection logic when using assistants to properly respect model configuration. When ‘pulze’ is specified as the model, the system now correctly falls back to using the assistant’s configured model, max tokens, and temperature settings instead of overriding them. This ensures consistent behavior when using dynamic model routing with assistants.

February 2025

FEB 28 IMPROVEMENT - Added default model selection with smart routing option

Added a new model selector component that allows users to set a default model for their space, including a smart routing option that automatically selects the best model based on queries. The selector includes model descriptions, tooltips for each option, and provides visual feedback when selections are made. Changes are saved automatically and persist across sessions.

FEB 27 IMPROVEMENT - Streamlined space home layout with simplified headers and improved navigation

FEB 26 MODEL - Added Claude 3.7 to available models list

Added Claude 3.7 to the platform’s promotional banner alongside existing o3-mini and DeepSeek-R1 models. The banner on the landing page now displays all three available models to users.

FEB 25 MODEL - Added Claude 3.7 Sonnet models with 200K context window

Added two new Anthropic models: claude-3-7-sonnet (latest version alias) and claude-3-7-sonnet-20250219, both featuring a 200,000 token context window. These models support function calling, streaming, and chat functionality, with token costs of

0.003/1K for input and

0.015/1K for output tokens. Vision capabilities are not supported.

FEB 24 FIX - Fix Gemini handling of plugin results data structure

Fixed an issue where Gemini model interactions would fail when processing plugin results in an unexpected format. The system now properly handles both dictionary and non-dictionary plugin results by automatically converting non-dictionary responses into a structured format with an ‘original_prompt’ key. This improves reliability when using plugins with Gemini models.

FEB 22 FIX - Fixed chat completion API error handling for missing attributes

Improved error handling in chat completion API calls by adding defensive checks for missing ‘tool_calls’ and ‘usage_metadata’ attributes. This prevents API failures when these optional attributes are not present in model responses, making the API more robust and reliable.

FEB 19 IMPROVEMENT - Enable streaming for O1 models and improve human-in-loop functionality

Added streaming support for O1, O1-preview, and O1-mini OpenAI models. Fixed human-in-loop functionality to properly handle plugin names with hyphens. Enhanced security by masking API keys in assistant tool configurations alongside other sensitive credentials like access tokens and consumer secrets.

FEB 19 NEW - Added API Request Plugin for External API Integration

Added a new API Request plugin that enables making HTTP requests to external API endpoints. The plugin supports configurable URLs, HTTP methods (GET, POST, PUT, DELETE), custom headers including API key authentication, and flexible request body handling with JSON validation. Plugin can also combine results from other plugins as input data.

FEB 19 IMPROVEMENT - Changed handling of organization tools availability status

Modified how tool availability is determined across the platform. Previously undefined tools were treated as enabled by default, now they are considered disabled until explicitly enabled. This affects tool visibility in assistant creation dialogs and the permissions management interface, providing more consistent tool availability management.

FEB 17 IMPROVEMENT - Enhanced web search plugin with context from previous plugin results

The web search plugin now supports chaining results from other plugins by incorporating their output as context. Added ability to configure whether to use previous plugin results through the ‘data’ configuration array, allowing more sophisticated search queries that build on earlier plugin responses. This enables multi-step reasoning chains where web searches can be informed by context from other plugin executions.

FEB 17 IMPROVEMENT - API Keys now ordered by date and searchable by name

Organization API keys are now sorted by creation date (newest first) instead of alphabetically. Added the ability to filter API keys by name using a search parameter. This makes it easier to find specific API keys in organizations with many tokens.

FEB 16 IMPROVEMENT - Standardize email handling for organization invites

All email addresses for organization invitations are now automatically converted to lowercase to prevent duplicate invites and ensure consistent matching. This means invites sent to ‘[email protected]’ and ‘[email protected]’ will be treated as the same email, improving the reliability of the invitation system and preventing potential confusion with case-sensitive email addresses.

FEB 13 IMPROVEMENT - Auto-select newly added tools in assistant configuration

When adding a new tool in the advanced tools configuration, it now automatically selects the newly created tool for editing. Also improved handling of empty instructions and prompts in the model selector configuration to prevent undefined values.

FEB 11 NEW - Enhanced API key management with granular permissions

Added comprehensive API key management functionality including the ability to create, update, delete, and regenerate API keys with granular permissions. New features include tracking key creation/modification dates, associating keys with specific users via auth0_id, and improved token access validation. API keys can now be managed individually with custom names and permission sets.

FEB 11 FIX - Fixed prompt logging to show only latest user message instead of full conversation

Modified how prompts are logged in chat completions to only store the most recent user message instead of the entire conversation history. This improves log readability and fixes issues where system prompts and previous messages were unnecessarily included in logs. The change also includes a slight adjustment to Anthropic model scoring penalties from -0.14 to -0.13.

FEB 05 FIX - Fix handling of multi-part chat messages with lists and text content

Fixed handling of chat messages containing multiple content parts or lists, ensuring proper concatenation of text content in message processing and scoring. The update improves support for complex message structures in chat conversations by correctly handling both string and list-based content types, with proper text extraction and formatting.

FEB 05 MODEL - Added Google Gemini 2.0 Flash model with 1M token context window

Added support for Google’s Gemini 2.0 Flash model (gemini-2.0-flash-001) with a 1M token context window. This next-gen model features superior speed, native tool use, multimodal capabilities including vision support, and function calling. The model supports streaming and has token costs of 0.1¢ per 1M prompt tokens and 0.7¢ per 1M completion tokens.

FEB 04 MODEL - Added Groq LLaMA 3.3 70B and LLaMA 3.2 90B Vision models

Added two new Groq models: LLaMA 3.3 70B Versatile (128K context) for multilingual tasks and LLaMA 3.2 90B Vision Preview (128K context) for image analysis and reasoning. LLaMA 3.2 90B Text Preview has been deprecated as of November 25, 2024. Both new models support streaming, function calling, and chat functionality, with the Vision model adding specific image processing capabilities.

FEB 03 NEW - Added global assistant enabling/disabling controls in organization settings

Added ability to globally enable or disable specific assistants through organization configuration settings. Each assistant now includes a ‘globally_disabled’ flag that can be controlled via the global configuration, allowing organization administrators to centrally manage assistant availability across their organization. This change synchronizes assistant status with global configuration settings when listing or retrieving assistants.

FEB 03 IMPROVEMENT - Enhanced tool descriptions and UI labels in assistant creation dialog

Improved the tool configuration interface by replacing simple labels with comprehensive tool information including detailed descriptions and documentation links. Each tool (Model Selector, Add Data, Web Search, etc.) now displays a more informative label and includes a detailed description explaining its functionality. This update makes it easier for users to understand and correctly configure tools when creating assistants.

January 2025

JAN 31 MODEL - Added OpenAI o3-mini model with 2M token context window

Added OpenAI’s o3-mini model, their latest small reasoning model optimized for science, math, and coding tasks. The model features a 2M token context window, supports streaming, batch API, structured outputs, and function calling. It maintains the same cost efficiency as o1-mini (0.0000044 USD per completion token, 0.0000011 USD per prompt token) while offering improved intelligence.

JAN 31 NEW - Add ability to assign and manage assistant categories

Added support for assigning and managing categories for assistants through the API. Users can now select multiple categories when creating or updating assistants, and categories are organized into groups. The update includes proper database relationships between assistants and categories, with the ability to view, assign, and modify category assignments while maintaining visibility settings.

JAN 31 NEW - Added deep linking and sharing for assistants

JAN 30 NEW - Added organization-wide model management capabilities

Organizations can now globally manage and monitor model usage across all spaces. Added new endpoints to view model status and configuration across spaces, including the ability to see which spaces are using specific models and whether models are globally enabled or disabled. This gives organization admins better visibility and control over model usage at the organization level.

JAN 30 IMPROVEMENT - Enhanced loading state for member details with skeleton animation

Improved the loading experience when viewing member details by replacing the basic ‘Loading’ text with an animated skeleton placeholder. The skeleton shows the expected layout with pulsing elements representing the member’s information fields, providing a smoother and more polished user experience.

JAN 29 MODEL - Added DeepSeek-R1-Distill-Llama-70B model on Groq platform

Added support for DeepSeek-R1-Distill-Llama-70B model on the Groq platform, featuring a 128K token context window. This fine-tuned version of Llama 3.3 70B excels at mathematical reasoning and coding tasks, and supports streaming, multiple completions (n), and penalties. The model is optimized for instant reasoning on GroqCloud™ with competitive pricing at

0.00079 per completion token and

0.00059 per prompt token.

JAN 29 IMPROVEMENT - Improved read-only mode for assistant editing interface

Enhanced the read-only state handling in the assistant editor by disabling all interactive elements when in read-only mode. This includes disabling the sharing controls, avatar input field, and edit tools button. The share visibility dropdown now shows a distinct disabled state with a sand-colored background, and the ‘Allow duplication’ checkbox respects the read-only state.

JAN 29 IMPROVEMENT - Improved assistant sharing interface with clearer visibility controls

JAN 29 FIX - Fixed model display width truncation in assistant tools editor

Fixed a UI issue where model names and provider logos could overflow their container in the assistant tools editor. Added minimum width constraints to prevent content from breaking layout when model names are long.

JAN 29 IMPROVEMENT - Default to smart model when selected model is unavailable

Enhanced model selection behavior to automatically fall back to the smart model when a previously configured model is not found. This improves reliability when editing assistants by preventing configuration errors due to unavailable models. Also reorganized the assistant editing interface to place Tools section after Persona & Writing Style for better UX flow.

JAN 29 FIX - Fixed image preview display for URLs containing special characters

Fixed an issue where image previews could break when image URLs contained special characters. The fix properly wraps image URLs in double quotes within the CSS background-image property, ensuring consistent display across all image sources.

JAN 29 IMPROVEMENT - Hide integrations section for free tier users

Integrations section (including Zapier and other AI widgets) is now hidden for free tier users, showing only a preview list of available integrations. Users with billing editor permissions will see an ‘Upgrade Now’ button to access these features. This change improves the clarity of premium features and provides a direct upgrade path for free tier users.

JAN 28 FIX - Fixed duplicate tools and human-in-loop functionality

Fixed an issue where tools were being added multiple times to assistants, causing duplicates in the available tools list. Also resolved a bug in human-in-loop functionality that was caused by inconsistent tool name formatting (with dashes vs underscores). The X (Twitter) post tool description has been updated to be more accurate.

JAN 28 MODEL - Added DeepSeek-R1 model and assistant categories support

Added DeepSeek-R1 model with 160K context window, available through Together.ai (

0.007/1K tokens) and Fireworks.ai (

0.008/1K tokens) providers. The model supports streaming, penalties, and multi-completion (n>1) capabilities, but does not support JSON mode or function calling. Also introduced a new categorization system for assistants with predefined groups like Marketing & Sales, Finance & Legal, Operations & HR, Engineering & Support, and Fun & Lifestyle.

JAN 28 IMPROVEMENT - Added DeepSeek-R1 model and refreshed landing page layout

Added support for DeepSeek-R1 model with a prominent promotional banner at the top of the landing page. The landing page has been redesigned with clearer sections for ‘Chat with AI’ and ‘Automate Tasks’, featuring more detailed descriptions and organized feature lists. The UI now emphasizes collaborative workspaces and task automation capabilities.

JAN 18 NEW - Enhanced chatbot widget configuration with plugins and auto-tools support

JAN 13 MODEL - Added OpenAI o1 model with 200K context and advanced reasoning capabilities

Added support for OpenAI’s o1 model, designed for complex reasoning with a 200K token context window. The model features built-in chain-of-thought processing and supports advanced capabilities including function calling, JSON output, vision tasks, and custom penalties. Token costs are set at

0.015/1K for prompt tokens and

0.060/1K for completion tokens.

JAN 03 IMPROVEMENT - Improved space selection dropdown positioning and visual design

December 2024

DEC 29 NEW - Add ability to filter custom data and documents by specific apps

Added support for filtering organization custom data and documents by specific app IDs. Users can now pass an optional list of app_ids to narrow down results to only show custom data and documents associated with particular applications. This filtering works in conjunction with existing filters like show_public_only, show_org_only, and search functionality.

DEC 29 NEW - Added data upload capabilities to space home page

Users can now upload data files directly from the space home page with support for custom file formats and webpage URLs. New features include a progress indicator during uploads, error handling for invalid file types, and ability to upload multiple files simultaneously. Supports file state tracking (UPLOADING, CREATED, PENDING, DELETING, QUEUED) with visual indicators for processing and error states.

DEC 27 FIX - Fixed document listing to show documents without versions

Fixed an issue where documents without any versions were not appearing in document lists. The improved query now correctly displays all documents, including those without versions, and properly handles documents with deleted versions by showing their most recent non-deleted version. The modified_on date now falls back to the document’s creation date when no versions exist.

DEC 27 IMPROVEMENT - Automatic document naming based on request content

Documents created from requests now automatically receive intelligent titles based on their content. The system uses an AI model to analyze the request’s response text and generate a relevant, concise title between 5-12 words that captures the main theme of the document. This improves document organization and searchability without requiring manual title input.

DEC 27 NEW - Add document type filtering to data table

Added new filtering options to the organization’s data table that allow users to filter content by document type. Users can now specifically show only documents or only custom data entries using the new ‘show_documents_only’ and ‘show_custom_data_only’ filter parameters. This enhancement improves content organization and navigation in the data table view.

DEC 23 NEW - Added documents to global data view alongside custom data

Documents are now displayed alongside custom data in the global data view, showing metadata like title, modification date, and version information. Users can view document details including associated apps, visibility status, and state through a new API endpoint (/documents/). This provides a unified view of both custom data files and documents within the organization.

DEC 20 IMPROVEMENT - Reduced success toast notification duration from 20 to 8 seconds

Success toast notifications now automatically dismiss after 8 seconds instead of 20 seconds, providing a more streamlined user experience while still ensuring messages are visible long enough to be read. This change affects all success notifications across the application while maintaining the 3-second duration for other toast types.

DEC 19 MODEL - Add experimental Gemini 2.0 Flash model with 1M token context window

Added support for Google’s experimental Gemini 2.0 Flash model, featuring a massive 1M token context window and multimodal capabilities. This next-generation model supports streaming, function calling, and vision tasks, with improved speed and native tool use. Pricing is set at

0.00000015 per prompt token and

0.0000006 per completion token.

DEC 19 FIX - Improved error message handling during model streaming responses

Enhanced error reporting when AI model streaming fails by showing the actual error message to users instead of a generic ‘an error occurred’ message. This provides more specific and helpful feedback about what went wrong during model interactions.

DEC 06 FIX - Fixed message width in chat interface

Fixed an issue where selectable chat messages could overflow their container width. Messages now properly constrain to their container size with improved layout behavior using the ‘min-w-0’ and ‘grow’ CSS properties, ensuring a better visual experience when hovering and selecting messages.

DEC 06 IMPROVEMENT - Added clickable space links in custom data details and member management

Enhanced navigation by making space names clickable in the custom data details and member management interfaces. Users can now directly navigate to spaces by clicking on space names instead of having to manually navigate there. Also improved security on external links by adding noopener/noreferrer attributes.

DEC 02 NEW - Added conversation comments and replies functionality

Users can now add, edit, and delete comments on conversations, as well as reply to existing comments. Comments support full CRUD operations with user-specific permissions - only comment authors or admins can modify/delete comments. Each comment tracks metadata including creation time, author, and deletion status.

November 2024

NOV 29 IMPROVEMENT - Auto-navigate to conversation after sending message from space home

Added automatic navigation to the conversation thread immediately after sending a new message from the space home page. When users submit a message through the ChatBox component on the space home screen, they will now be redirected to the full conversation view instead of staying on the home page.

NOV 27 NEW - Added support for MP4 video file uploads

Added support for uploading MP4 video files by including video/mp4 and video/x-mp4 MIME types to the list of valid file formats. This expands the platform’s media handling capabilities beyond audio formats like WAV and WebM.

NOV 27 FIX - Fixed UI overflow issues in saved prompts and file type displays

Fixed text overflow issues in two areas: saved prompts now properly truncate after two lines using line-clamp, and long file MIME types are now truncated with an ellipsis. This improves readability and prevents UI layout breaks when displaying long prompt text or file type information.

NOV 25 IMPROVEMENT - Upgrade button visibility and functionality improved for billing management

Enhanced the ‘Upgrade Now’ button functionality by adding click navigation to the billing page and restricting visibility to users with appropriate billing permissions (Admin ALL, Admin Billing, Editor ALL, or Editor Billing). The button appears for accounts using their upload quota (10 uploads for free tier, 100 for paid tier) and directs users to the organization billing page.

NOV 22 NEW - Added Hyper-Personalization with Smart Learning capabilities

Introduced Smart Learning feature that learns from highly-rated responses to personalize model selection. When users rate responses positively, the system now stores the prompt, model, and context in a vector database for future optimization. This personalization system automatically removes data from poorly rated responses to continuously improve recommendation accuracy.

NOV 22 IMPROVEMENT - Enhanced Space UI layout and performance improvements

Improved the Space interface by removing the ChatWrapper component for better performance and adding a maximum width limit (96) to space names in the navigation menu. Chat UI state now resets when switching between spaces, and console logging for image preview errors has been removed for cleaner debugging. Additionally, the chat interface is now wrapped in a ChatProvider component with proper app settings and ID context.

NOV 20 MODEL - Added Pixtral Large model and deprecated OctoAI models

Added Pixtral Large (124B) multimodal model from MistralAI with 128K context window support. The model excels at mathematical reasoning, document analysis, and visual data interpretation, with pricing at

0.006/1K tokens for completion and

0.002/1K tokens for prompts. All OctoAI models have been deprecated as of October 31, 2024. The new Pixtral model supports streaming, JSON output, function calling, and vision capabilities.

NOV 19 MODEL - Adjusted scoring penalty for Anthropic models

Modified the quality score adjustment for Anthropic models, reducing the penalty from 0.16 to 0.14. This change results in slightly higher quality scores for Anthropic models like Claude while maintaining the relative ranking with other providers.

NOV 19 NEW - Added message sharing functionality with email notifications

NOV 19 IMPROVEMENT - Simplified navigation breadcrumb structure in menu system

NOV 18 IMPROVEMENT - Enhanced organization member permission management

Added ability to edit individual organization member permissions via a new API endpoint. Organization admins and editors can now view individual member details and update permissions for specific members, while maintaining security by only allowing modification of permissions that the editor themselves has access to.

NOV 18 IMPROVEMENT - Remove test label from space name display in member permissions view

Removed unnecessary ‘test’ label that appeared below space names in the member details permissions interface. This improves the UI clarity by showing only the actual space name without redundant test text.

NOV 18 IMPROVEMENT - UI polish: Button styles, navigation menu, and member management updates

NOV 17 IMPROVEMENT - UI improvements for thread menu, chat renaming, and model comparison

NOV 16 MODEL - Adjusted model scoring for OpenAI and Anthropic models, added O1 model scores

Added scoring support for OpenAI’s O1-preview (score: 1.0) and O1-mini (score: 0.9) models. Implemented scoring adjustments that reduce OpenAI model scores by 0.3 and Anthropic model scores by 0.16, affecting model recommendations in the platform.

NOV 16 FIX - Fixed search to include custom data associated with labels

Fixed search functionality to properly include custom data when associated with labels by adding distinct query results and improving the join relationships between custom data tables and label tables. This ensures that searches now correctly return all relevant custom data entries that are connected to specific labels without duplicates.

NOV 16 IMPROVEMENT - Enhanced conversation title generation with clearer emoji-led formatting

Improved the automatic conversation naming system to generate more consistent and descriptive titles. Each conversation title now starts with a relevant emoji followed by a concise, grammatically correct heading under 50 characters. The system now follows a standardized format (e.g., ’🚀 Space Exploration Technologies’) to better capture conversation topics.

NOV 16 IMPROVEMENT - Enhanced split view layout with new file icons and playback controls

Added a comprehensive file icon system supporting 20+ data sources including Google Drive, Slack, Notion, and more. Introduced a new PlayRing icon and improved file visualization component that automatically renders the appropriate icon based on the content source. This update provides better visual context for different file types and sources in the interface.

NOV 16 FIX - Fixed conversations layout and improved display of participant avatars

Fixed layout issues in the conversations view by implementing fixed-width columns and proper text wrapping. Added proper handling for conversation participant avatars with a maximum width of 32 pixels and shrink prevention for action buttons. The conversation list now displays up to 10 conversations per page with improved spacing and border handling.

NOV 15 NEW - Added conversation read status tracking and saved-for-later functionality

Added two new features: (1) Conversation read status tracking that allows marking conversations as read with timestamps, and (2) A saved-for-later feature that lets users bookmark app requests for future reference. Also improved organization invites by preventing duplicate invitations and memberships through new database constraints.

NOV 07 FIX - Fixed conversation scrolling behavior in chat sidebar

Removed automatic scrolling behavior that was causing the conversation list to jump to the selected chat. Users can now naturally scroll through their conversation history without the view automatically repositioning to the active conversation.

October 2024

OCT 30 FIX - Fixed empty responses from OpenAI O1-class models due to token limit

Fixed an issue where O1-class OpenAI models (like GPT-3.5-Turbo and GPT-4) could return empty responses due to restrictive token limits. The maximum completion token limit has been increased to 32,000 tokens, allowing for much longer model responses.

OCT 29 FIX - Fixed message lookup in chat comparison view

Fixed an issue where messages couldn’t be properly compared in the chat comparison view due to message ID mismatches. The system now preserves the frontend-generated message ID using a new ‘_old_id’ field and uses it as a fallback lookup mechanism, ensuring messages can be correctly referenced and compared even after server synchronization.

OCT 24 IMPROVEMENT - Enhanced billing UI with improved pricing display and yearly subscription toggle

Improved the billing interface with a larger, more prominent yearly/monthly toggle switch and clearer pricing display. Prices now show per-seat monthly costs (e.g.,

28/seat/month for yearly Pro plan,

35/seat/month for monthly) with a visible 20% yearly discount. Added clearer billing cycle indicators and made the pricing toggle more user-friendly with clickable labels.

OCT 22 NEW - Added Collections feature to organize conversations and requests in spaces

Introduced a new Collections feature that allows users to organize and group conversations and requests within spaces. Users can create named collections with descriptions, add requests to collections, and manage collection items through new API endpoints. Collections are unique within an app and support search functionality.

OCT 09 IMPROVEMENT - Improved microphone permission handling in chat interface

Changed how microphone permissions are checked to be less intrusive by using the Permissions API instead of automatically starting a recording. Users will now only be prompted for microphone access when they actively try to record audio, rather than on page load. Also adds better error handling when microphone access is denied.

September 2024

SEP 30 NEW - Add support for microphone audio input with WebM format

Added support for audio input via microphone by enabling WebM audio format (audio/webm) and handling WebM video format conversion. Users can now record audio directly through their microphone in addition to uploading audio files. The system automatically handles format detection and conversion of WebM video formats to their audio equivalents for transcription.

SEP 30 MODEL - Added Llama 3.2 Vision and Text Models

Added four new models: Phi-3.5 Vision Instruct (32K context), Llama 3.2 11B Vision Instruct (131K context), Llama 3.2 90B Text Preview (8K context), and Llama 3.2 90B Vision Instruct Turbo (131K context). These models support various capabilities including vision processing, long-form text generation, and advanced reasoning. The vision models enable image captioning, visual question answering, and document analysis, while the text model excels at general knowledge, coding, and multilingual translation.

SEP 30 NEW - Added support for file attachments in apps and playground

Added support for uploading and attaching various file types to apps, including images (PNG, JPEG, WebP, SVG) and audio files (AAC, FLAC, MP3, M4A, WAV, etc). Files are securely stored with signed URLs and can be referenced in conversations. This feature requires a paid subscription and includes mime-type validation for security.

SEP 30 IMPROVEMENT - Chat submissions now require text input

Messages can now only be sent when text input is present, even if files are attached. Previously, messages could be sent with just attached files and no text. This change ensures more intentional message submissions by requiring users to provide text content along with any attachments.

SEP 30 IMPROVEMENT - Added loading states when switching between chat conversations

Improved the user experience when switching between conversations by adding visual loading indicators. Users will now see animated pulse placeholders for the conversation title and message history while content is loading. The message input is also automatically disabled during conversation switches to prevent premature submissions.

SEP 27 NEW - Added audio transcription support using Gemini and Groq

Added support for audio file transcription via Gemini and Groq integration. Users can now upload audio files in various formats (mp3, wav, flac, ogg, opus, etc.) and have them automatically transcribed. The system intelligently determines if transcription is needed based on conversation context and handles the transcription process in the background, supporting both streaming and non-streaming responses.

SEP 26 IMPROVEMENT - Simplified assistant tour tooltip with clearer explanation

SEP 25 FIX - Fixed message retry handling when API requests fail

Fixed an issue where failed API requests would incorrectly finalize message state, preventing proper retry attempts. The system now preserves the original message state when errors occur, allowing for proper retry functionality and more reliable error recovery.

SEP 24 NEW - Added plugins support for chat completions API endpoint

Added support for plugins in chat completion requests through a new ‘plugins’ field. Users can now specify a list of plugins to enable when making requests to the /chat/completions endpoint. This enhancement allows for dynamic plugin activation on a per-request basis.

SEP 24 NEW - Enhanced onboarding tour with image analysis and model comparison features

Updated the product tour to include comprehensive guidance for two major features: AI image analysis and model comparison. Users can now learn how to upload or paste images directly for AI analysis, with detailed instructions for combining text and image queries. The tour also demonstrates the new side-by-side model comparison feature, allowing users to compare responses from different AI models for the same prompt. Both features include step-by-step instructions and specific use cases.

SEP 24 NEW - Add clipboard image paste support and improve image upload UI

Users can now paste images directly from clipboard into chat conversations using Ctrl+V/Cmd+V, supporting PNG, SVG, and JPEG formats. The image upload interface has been enhanced with better error handling, visual feedback for invalid uploads, and improved UI elements including a border around image previews and better z-indexing for delete buttons. Images can be added both through file upload and clipboard paste, with toast notifications for error states.

SEP 23 NEW - Add support for multimodal prompts (images) and new Pixtral-12B vision model

Added support for image inputs in prompts and introduced the Pixtral-12B vision model from MistralAI with 128K context window. Updated vision capabilities for existing models including Claude-3 series, GPT-4 Turbo, and Gemini 1.5 models. The Pixtral-12B model supports streaming, JSON output, and function calling alongside its vision capabilities, with pricing at $0.0000015 per token.

SEP 23 FIX - Fix web search error handling to prevent failed searches

Fixed an issue where web searches could fail silently when search results were missing required text content. The system now properly logs and skips invalid results while continuing the search, preventing searches from breaking when incomplete data is returned.

SEP 23 IMPROVEMENT - Allow one datasource in Free plan

Free plan users can now add one datasource to their organization. Previously, Free plan users could not add any datasources. The playground prompt request endpoint has also been removed from datasource limit enforcement, making it more accessible. Error messages have been updated to be more informative, now specifically suggesting an upgrade to Pro plan when limits are exceeded.

SEP 23 FIX - Fixed UI height issues when notification banners are displayed

Fixed a layout issue where notification banners (Pro Plan, Subscription Expired, Version Check) caused incorrect content height calculations. The fix ensures proper content scaling and prevents overflow issues by adding min-height constraints to the main layout containers.

SEP 17 FIX - Fixed token handling for OpenAI models and label reference formatting

Fixed inconsistent token limit handling for OpenAI models by properly setting max_tokens parameter for non-O1 models and max_completion_tokens for O1 models (o1-mini and o1-preview). Also resolved a label reference formatting issue that affected how application labels were processed and returned in the API response.

SEP 16 FIX - Fixed file metadata handling and parameter validation in custom data sources

Fixed validation and handling of file metadata in custom data sources, now properly supporting both URLs and file paths. Updated parameter validation to make file size optional and improved handling of external URLs. Also fixed model validation of custom data references to properly parse database rows.

SEP 16 MODEL - Engine update with automatic tool-calling and expanded model function support

Added function-calling support for 18 major models including Claude 3 series (Opus, Sonnet, Haiku), GPT-4 Turbo variants, Mistral Large, Llama 3.1-70B, and Cohere Command-R models. Models can now automatically detect and execute tool-calling operations based on conversation context. Migration includes updates to conversation handling and API response formats to support the new capabilities.

SEP 16 NEW - Added fullscreen preview for generated AI artifacts

Added support for viewing and previewing AI-generated artifacts (images) in chat conversations. Users can now click on generated image thumbnails to open them in a fullscreen modal view with a dark overlay background. The preview includes a close button and supports clicking outside to dismiss.

SEP 13 MODEL - Added O1 series models: o1-preview and o1-mini

Added support for OpenAI’s O1 series models: o1-preview for complex reasoning tasks and o1-mini optimized for coding and math. Both models feature 128K context windows, support for function calling and JSON output, with October 2023 knowledge cutoff. The o1-preview model costs

0.060/1K completion tokens and

0.015/1K prompt tokens, while o1-mini offers more cost-efficient rates at

0.003/1K completion tokens and

0.012/1K prompt tokens.

SEP 13 IMPROVEMENT - Enhanced AI Chatbot widget with customizable styling and branding options

SEP 13 IMPROVEMENT - Assistant mentions now highlighted in orange in chat editor

Added distinct orange highlighting for assistant mentions in the chat editor interface. Assistant references now appear with an orange background (bg-orange-100) and orange border (border-orange-400), providing better visual differentiation from other mention types like data (purple), models (green), and other references (blue).

SEP 12 IMPROVEMENT - Added loading states and error handling to assistants page

Improved user experience on the assistants page by adding loading states with animated placeholders while content loads, and clear error messages when data fails to load. Popular and new assistants sections now show loading animations and handle error states gracefully with expandable error details.

SEP 12 IMPROVEMENT - Removed assistants from mention panel suggestions

Assistants are no longer displayed in the mention panel when typing ’@’ in Trypulze. While the assistant mention candidate type remains in the codebase, they are now filtered out from appearing as suggestions to users. This temporarily simplifies the mention experience by focusing on other reference types like labels, messages, and search options.

SEP 11 IMPROVEMENT - Enhanced version update notification banner appearance

Improved the appearance of the version update notification banner by removing an unnecessary space character and adjusting text spacing. The banner now shows a cleaner ‘A new version is available’ message with better-aligned underlined text for the update action.

SEP 11 IMPROVEMENT - Remove ellipsis from collapsed text previews

Improved the display of collapsed text previews by removing the automatically appended ellipsis (…) from truncated content. This change provides a cleaner interface since truncated text often already includes natural breaks or ellipsis when needed.

SEP 10 FIX - Simplified organization settings page with focus on core display information

Streamlined the organization settings update process by focusing on essential display information (organization display name and logo) rather than billing and address details. Removes validation of billing email domains and billing customer updates, making the settings page more focused and easier to use.

SEP 10 IMPROVEMENT - Added auto-update notification banner for new frontend versions

Added a new notification banner that automatically checks for frontend updates every 10 seconds. When a new version is available, users will see a blue banner with a clickable ‘refresh’ link to update their application to the latest version. This helps ensure users are always running the most recent version without manually checking for updates.

SEP 10 IMPROVEMENT - Enhanced table styling in markdown responses

Improved the visual appearance of tables in markdown responses by adding consistent borders, alternating row colors, and proper padding. Tables now feature light sand-colored backgrounds for even rows and headers, with uniform cell padding and border styling for better readability.

SEP 10 IMPROVEMENT - Enhanced assistant description handling with collapsible text and detailed view

Improved how long assistant descriptions are displayed by adding a collapsible text component that shows a ‘More/Less’ toggle when text exceeds 350 characters. Added a new assistant details dialog that automatically displays when descriptions are longer than 4 lines. Users can now easily read long assistant descriptions without cluttering the interface, with proper line break handling and formatted text display.

SEP 10 NEW - Added ServiceNow integration support

Added ServiceNow as a new integration option, including the ServiceNow logo asset and integration capabilities. This expands the available integrations alongside existing options like Salesforce, OneDrive, and Zendesk.

SEP 09 IMPROVEMENT - Added dynamic signed URL generation for generated artifacts

Improved handling of generated artifacts (like DALL-E images) by implementing a new dynamic URL signing system. Images are now stored in a dedicated artifacts bucket with organization and app-specific paths, and URLs are generated on-demand using a new ‘pulze://’ protocol. This change provides more secure and organized access to generated content while maintaining long-term accessibility.

SEP 09 IMPROVEMENT - Improve frontend layout and button interactions

Enhanced the thread assistant preview layout with a centered, width-constrained design (max-width: 3xl). Fixed button interaction behavior to properly handle blur effects when clicked. Also improved mention candidate type displays for image generation, file search, and web search references.

SEP 09 NEW - Added response rating system with thumbs up/down buttons

Added ability for users to rate responses using thumbs up and thumbs down buttons. This feature introduces new SVG icons (thumbs-up.svg and thumbs-down.svg) and enhances the Button component with automatic blur functionality after clicking. The rating system allows users to provide direct feedback on response quality.

SEP 09 IMPROVEMENT - Preserve marketing attribution data during login process

Enhanced user tracking by maintaining UTM parameters (like utm_source, utm_medium, utm_campaign) across the authentication flow. When users log in, their marketing attribution data is now preserved and automatically restored after successful authentication, ensuring accurate campaign tracking and analytics.

SEP 09 FIX - Fixed chat not updating when switching between different apps

Fixed an issue where the chat interface wouldn’t properly reset and re-render when switching between different applications (appIds). The fix ensures that the chat context and conversation state are properly reset by forcing a full re-render of the ChatProvider component when the appId changes.

SEP 09 IMPROVEMENT - Set minimum height for chat input box

Added a minimum height of 20 units to the editable chat input box while maintaining the maximum height of 40% viewport height. This improves usability by preventing the input area from collapsing too small when empty.

SEP 06 FIX - Improved file handling for custom data uploads and RAG integration

Fixed several issues with file handling in custom data management: improved URL-safe filename encoding for uploads, fixed file refresh logic for external URLs, and corrected RAG webhook processing for malformed objects. Now properly handles filenames with special characters and more reliably processes file refreshes for both local and external data sources.

SEP 06 IMPROVEMENT - Added GitHub Flavored Markdown (GFM) support for content rendering

Added support for GitHub Flavored Markdown through the remark-gfm v4.0.0 package. Users can now use advanced markdown features like tables, task lists, strikethrough text, autolinks, and footnotes when writing or viewing markdown content in the application.

SEP 05 IMPROVEMENT - Enhanced file processing reliability and filtering in RAG integration

Improved handling of files uploaded through RAG by adding better validation and filtering of malformed objects. Added support for handling missing file statistics gracefully and introduced new database indices for better performance. System now automatically removes invalid or malformed files and provides clearer error messages when files cannot be refreshed.

SEP 05 NEW - Added RAG integration for document management and data storage

Introduced RAG integration for enhanced document and data management capabilities. The update adds support for external file storage, URL management, and automatic sync tracking through new database fields like object_id and data_source_type. Users can now store external URLs and track document modifications with automatic timestamp updates.

SEP 05 IMPROVEMENT - Added DALL·E 3 image generation loading state and retry button

Enhanced the image generation experience by adding a dedicated loading state with spinner animation when DALL·E 3 is generating images. Users can now see a visual loading indicator in a square container, and after generation, they have the option to regenerate images using a new ‘Generate new image’ button. Analytics tracking for image generation attempts has also been added.

SEP 05 IMPROVEMENT - Added new sync failure status indicator in file preview

Added a new status indicator that shows ‘Syncing Failed’ in red text when a file encounters synchronization issues. This provides clearer feedback when file synchronization encounters problems, distinguishing sync failures from general processing failures.

SEP 05 IMPROVEMENT - Added important links to user menu dropdown

SEP 04 NEW - Added DALL-E 3 image generation plugin for chat completions

Added support for generating images using DALL-E 3 directly within chat completions. Users can trigger image generation using the ‘image-generation-dalle’ command, which creates 1024x1024 images with standard quality. Generated images are automatically stored in S3 with 7-day signed URLs for access.

SEP 04 IMPROVEMENT - Simplified mention candidate display text for image generations

Simplified the display text for image generation mentions in the editor, changing from ‘image-generation-dalle’ to just ‘image’. This makes the interface cleaner and more user-friendly when referencing DALL-E generated images in the editor.

SEP 04 FIX - Fix conversation continuation when parent message has an error

Fixed an issue where conversations would completely stop when encountering an error in a parent message. Now, the conversation can continue by automatically falling back to the previous valid message in the chat history, allowing users to keep chatting even if one message fails.

SEP 04 NEW - Added Integrations panel to chat interface

Added a new Integrations panel in the chat interface, accessible via a dedicated button with DataFlow icon. This feature is available exclusively for Pro plan subscribers and can be toggled from the sidebar. The panel maintains consistent dimensions (400px width) with other sidebar components like Members and Models panels.

SEP 04 FIX - Fixed model selection behavior when using dynamic model choice

Fixed an issue where selecting the dynamic/smart model option wasn’t properly handled in the assistant creation and editing interfaces. Now correctly sets the model_id to null (instead of undefined) when users choose the dynamic model option, ensuring proper backend compatibility and consistent model selection behavior.

SEP 03 FIX - Fixed tooltip behavior for Members panel in free tier

SEP 03 IMPROVEMENT - Added upgrade banner for free tier users

Added a new notification banner that encourages free tier users to upgrade to the Pro plan. The banner appears at the top of both mobile and desktop interfaces, displaying a clickable message that directs users to the billing page. Users can dismiss the banner by clicking the close icon, and it remains hidden until the next session.

SEP 03 IMPROVEMENT - Added Members panel to chat interface

Added a new Members panel to the chat interface, accessible via a dedicated button in the side navigation. Users can now toggle the Members panel visibility alongside existing Data and Models panels. The panel displays team member information and is designed with a consistent 400px width layout matching other side panels.

SEP 03 IMPROVEMENT - Enhanced assistant cloning with pre-populated fields

The Create Assistant dialog now supports cloning existing assistants by pre-populating all fields including name, avatar, description, visibility settings, persona, instructions, greeting, writing style, and model settings. This allows users to easily create new assistants based on existing ones while retaining all configuration options with the ability to modify them before creation.

SEP 03 NEW - Added space selection dropdown with improved UI and search functionality

SEP 03 IMPROVEMENT - Added 'make purge' command and optimized 'make clean' for faster local development

Introduced a new ‘make purge’ command that completely removes all Docker containers, images, volumes, and orphans for a full environment reset. The existing ‘make clean’ command has been optimized to only remove local images (instead of all images), making it significantly faster for routine cleanup while preserving cached images from external registries.

SEP 02 IMPROVEMENT - Improved assistant version cleanup and deletion behavior

Enhanced the assistant management system to automatically delete assistants when their last version is removed. When deleting assistant versions, the system now checks if any versions remain, and if not, automatically removes the parent assistant and associated favorites. This prevents orphaned assistant records and provides cleaner workspace management.

SEP 02 IMPROVEMENT - Added navigation link to assistant's owning space

SEP 02 NEW - Enhanced @ reference system with improved UI and guided tour

Added a new guided tour system that introduces users to the @ reference functionality and improved the mention panel UI. The mention panel now appears as a portal with better positioning and increased z-index (110) for improved visibility. The maximum width has been set to 650px for better readability, and the panel includes enhanced tooltips explaining space organization and data security features.

SEP 02 NEW - Added Make.com and Zapier integrations for workflow automation

Added official integrations with Make.com and Zapier, enabling users to create automated workflows with Pulze.ai. Users can now generate API keys specifically for these platforms and connect to hundreds of other apps and services. The integration includes direct invitation links to both Make.com and Zapier platforms for seamless setup.

SEP 01 IMPROVEMENT - Increased default maximum token limit to 4096 tokens

Increased the default maximum token limit from 2048 to 4096 tokens across the platform, including assistant versions and model settings. This change allows for longer conversations and completions by default without requiring manual configuration. The update affects both new assistants and the default behavior when max_tokens is not explicitly specified.

August 2024

AUG 30 MODEL - Updated Cohere Command-R and Command-R Plus models with new pricing

Updated Cohere Command-R Plus to the newer 08-2024 version with adjusted pricing (2.5e-6 per prompt token, 10e-6 per completion token). Command-R model was also updated to version 08-2024 with new pricing (0.15e-6 per prompt token, 0.6e-6 per completion token) and had its deprecation status removed.

AUG 30 FIX - Fixed input field width consistency across all dialogs

Standardized the width of input fields across all dialog boxes, including organization creation, space creation/renaming, chat renaming, label editing, and member invitation forms. Replaced fixed minimum widths (300px) with responsive full-width inputs that automatically adjust to their container size, providing a more consistent and adaptable user interface.

AUG 30 IMPROVEMENT - Limit assistant descriptions to 4 lines in mention suggestions

Improved readability of assistant descriptions in the mention suggestions dropdown by limiting them to a maximum of 3 lines of text plus the name line. This prevents long descriptions from taking up too much space while still providing relevant information.

AUG 29 IMPROVEMENT - Enhanced assistant list sorting and default assistant visibility

Improved the assistant list UI to better highlight default and favorite assistants. Added ‘current_app_default’ flag to clearly indicate which assistant is set as default for the current app, and enhanced sorting to prioritize default assistants followed by favorites. Also improved shared assistant visibility with better annotation handling.

AUG 29 IMPROVEMENT - Enhanced Assistants UI with improved organization and ownership context

Improved the Assistants interface by adding organization ownership context and refined sorting capabilities. Assistants now display their owner organization’s name and ID, with clearer indicators for whether an assistant belongs to the current organization or app. The sorting algorithm has been enhanced to better handle popular assistants based on favorites and modified dates.

AUG 29 MODEL - Added Gemini 1.5 Flash Experimental and updated Gemini model pricing

Added the experimental Gemini 1.5 Flash model with 1M token context window, supporting streaming but not JSON, functions, or penalties. Updated Gemini 1.5 Pro Experimental (formerly Pro-exp-0801) and adjusted token pricing for Gemini 1.5 Flash models to 0.000075¢ per prompt token and 0.0003¢ per completion token.

AUG 29 IMPROVEMENT - Enhanced assistant annotations and improved assistant list visibility

Assistants now display additional metadata including favorite status, app ownership, and organization ownership. List views have been updated to better organize assistants by showing favorites first, followed by app-owned assistants, then organization-owned assistants, and finally public assistants. Also improved assistant reference handling for access tokens with proper auth0_id filtering.

AUG 29 IMPROVEMENT - Updated descriptions for popular and new assistants sections

Improved the descriptive text in the Assistants page with more engaging and clear section subtitles. The ‘Popular’ section now reads ‘Our community’s favorite assistants. ❤️’ and the ‘New and updated’ section displays ‘Freshly baked assistants, hot off the press.’

AUG 29 IMPROVEMENT - Move model whitelist controls to side panel for easier access

Relocated the model whitelist management to a dedicated side panel, making it easier to select and customize which AI models are available for your conversations. Added a new cube icon for model management and improved the model display component to show clearer provider logos and names. This change includes a new onboarding tour step to help users discover and understand the model whitelist feature.

AUG 28 IMPROVEMENT - Added sorting options for shared assistants page

Added new view parameter to sort shared assistants by either ‘newest’ (default) or ‘popular’. Popular view sorts assistants by the number of favorites, while newest view sorts by the most recently modified published version. Both sorting options maintain alphabetical ordering by name as a secondary sort.

AUG 26 IMPROVEMENT - Improved assistant list ordering with favorites and organizational context

Enhanced the ordering of assistants in list views to show a more logical and user-friendly sequence. Assistants are now ordered by priority: favorites appear first, followed by assistants owned by the current app, then assistants shared within the organization, and finally publicly shared assistants. Within each priority level, assistants are sorted alphabetically by name.

AUG 26 NEW - Added Assistants feature with versioning and configuration management

Introduced a new Assistants feature that allows creating and managing AI assistants with versioning support. Users can now configure assistants with custom personas, instructions, greetings, writing styles, and sample interactions. Each assistant can be published with specific model settings (temperature, max tokens) and visibility controls. Added support for favoriting assistants and storing conversation-specific settings.

AUG 22 MODEL - Added AI21's Jamba 1.5 Large model and deprecated Jamba Instruct

Added AI21 Labs’ Jamba 1.5 Large model, featuring a 256K context window and support for function calling, JSON output, and streaming. This state-of-the-art hybrid SSM-Transformer model offers up to 2.5X faster inference than comparable models and is optimized for business use cases. The older Jamba Instruct model has been deprecated.

AUG 21 FIX - Added error notification when space creation fails

Added error handling to display a toast notification when creating a new space fails. Users will now see a visible error message ‘Creating space failed!’ instead of a silent failure, making it clearer when there’s an issue during space creation.

AUG 20 IMPROVEMENT - Enhanced search functionality to filter data by label names

Search functionality has been expanded to look for matches in both file names and label names when filtering custom data. Users can now find their data by searching for either the file name or any associated label names, making data discovery more flexible and comprehensive.

AUG 20 IMPROVEMENT - Removed legacy billing system enforcement and simplified subscription checks

Simplified the organization subscription validation process by removing the legacy billing system and trial balance enforcement. Users will now receive clearer subscription status messages, and the billing information endpoint has been streamlined to support the new subscription model. This change affects how subscription status is checked and displayed but does not impact actual subscription features or limits.

AUG 20 NEW - Enhanced API Access Token System with Granular Permissions

Introduced a new access token management system for API authentication, replacing the legacy API key pattern. Users can now create and manage multiple access tokens per app with granular permissions through new endpoints ‘/access-tokens’. The system includes automatic token generation during app creation and the ability to list all tokens for an app, with improved security through permission-based access controls.

AUG 20 IMPROVEMENT - Conversations now sort by last modified date within their time groups

Enhanced conversation organization by sorting threads by most recently modified within their respective time groups (Today, Yesterday, Last Week, This Year, and Older). This improvement ensures the most recently active conversations appear first in each section, making it easier to find recent discussions.

AUG 20 IMPROVEMENT - Add dismiss button to onboarding tour prompt

Added ability to dismiss the ‘Start Tour’ prompt box that appears for new users in their first workspace. Users can now either start the guided tour or close the prompt using a new dismiss button in the top-right corner. The tour state is properly reset when dismissed, preventing the prompt from reappearing.

AUG 19 NEW - Added guided product tour using react-joyride

Added an interactive product tour feature using react-joyride v2.8.2 to help new users learn the platform’s interface and features. This introduces step-by-step guidance and tooltips that can walk users through key functionality of the application.

AUG 19 IMPROVEMENT - Chat threads now ordered by last modified date in sidebar

Chat conversations in the sidebar are now sorted by when they were last modified instead of creation date. This change ensures your most recently active conversations appear at the top, making it easier to find and resume recent chats.

AUG 17 IMPROVEMENT - Enhanced analytics tracking for user interactions and payment confirmations

Added comprehensive event tracking across the application using Google Analytics and Twitter Pixel. Users’ interactions are now tracked for actions including copying code blocks, switching organizations, creating new organizations, sending chat messages, using reference systems (@), and upgrading plans. Also added Twitter conversion tracking to better measure payment confirmations and sign-ups.

AUG 15 IMPROVEMENT - Added subscription enforcement and usage tracking for API endpoints

Implemented subscription-based access control across key API endpoints including app creation, chat completions, playground access, and organization management. Users now have daily request limits based on their subscription tier (Free, Pro, etc.) with accurate usage tracking. Organizations are limited to one free tier per billing email, and request counts are tracked per organization on a daily basis.

AUG 14 FIX - Fixed empty system prompts handling in model requests

Fixed a bug where empty system prompts were being passed to language models unnecessarily. The system now only injects the system prompt from space settings when instructions are actually present and not empty. This change also removes the automatic inheritance of max_tokens and temperature from app settings in playground requests.

AUG 13 NEW - Added new Spaces subscription management system with tiered billing

Introduced a new billing system for Spaces that supports tiered subscriptions with configurable seats (1-100 per organization). Organizations can now manage their Spaces subscription through a new billing portal, including features like automatic tax handling, subscription upgrades/downgrades, and customizable seat quantities. The system integrates with Stripe for payment processing and includes automatic subscription status tracking.

AUG 08 NEW - Added CSV file support with HTML table rendering for RAG processing

Added support for CSV file ingestion in the RAG system. CSV files are now automatically converted to HTML table format during processing, with each row rendered as a complete table containing both headers and values. This enables better structured data representation and improves retrieval accuracy for tabular information.

AUG 07 IMPROVEMENT - Enhanced web scraping reliability by following HTTP redirects

Improved the web search and URL ingestion functionality to automatically follow HTTP redirects when fetching content. This ensures that URLs which have moved or redirect to other locations will be properly scraped instead of failing, making the RAG system more robust when processing web content.

AUG 06 IMPROVEMENT - Space settings now automatically apply to completion requests

Completions now inherit default values from space settings when not explicitly provided in the request. This includes automatically applying the space’s max tokens, temperature, and system instructions. Previously, these settings had to be manually specified in each request.

AUG 06 MODEL - Added GPT-4o (2024-08-06) model with 128K context window

Added GPT-4o-2024-08-06, a new OpenAI model variant with 128,000 token context window that supports structured outputs, streaming, function calling, and JSON mode. The model features improved token pricing at

0.015/1K tokens for completion and

0.005/1K tokens for prompts. This model version specializes in larger output token counts compared to previous versions.

AUG 06 IMPROVEMENT - Enhanced space management permissions for organization admins

Organization administrators can now manage members in spaces regardless of their space-specific permissions. This includes editing permissions and removing members, even if they are listed as the current user in that space. A new blue info banner indicates when an org admin is managing a space with elevated permissions.

AUG 01 MODEL - Add Gemini 1.5 Pro Experimental model and update context windows

Added Google’s experimental Gemini 1.5 Pro (gemini-1.5-pro-exp-0801) model with a 2M token context window. The model supports streaming and chat capabilities, with token costs of

0.00350/1K prompt tokens and

0.0105/1K completion tokens. Also updated the context window size to 2M tokens for existing gemini-1.5-pro and gemini-1.5-pro-001 models.

AUG 01 NEW - Added Space Widget permissions and endpoints for TryPulze.com integration

July 2024

JUL 31 NEW - Added programmatic API access tokens with granular permissions

Introduced a new access token system allowing programmatic access to apps with fine-grained permissions. Applications can now create and manage access tokens with specific permissions like data retrieval, conversation creation, and custom data management. This includes a new endpoint for retrieving token permissions and enhanced security through hashed token storage.

JUL 31 NEW - Added automatic system instruction injection from app settings

Chat completion requests now automatically inject system instructions from app settings when no system message is provided. This allows organizations to set default system prompts at the app level that will be prepended to all chat conversations that don’t explicitly include a system message, ensuring consistent behavior across interactions.

JUL 30 IMPROVEMENT - Updated permission requirements for deleting conversations

Changed permission level required for deleting conversations from Viewer to Editor access. Users now need Editor-level permissions to delete conversations within an app, providing better access control and data security.

JUL 30 IMPROVEMENT - Added user permission visibility in app settings endpoint

Enhanced the app settings API to include current user permissions when retrieving app details. Users can now see their specific access levels (admin or custom permissions) within each app space. This change improves permission transparency and helps users understand their access rights within the platform.

JUL 30 FIX - Fixed organization-level app permission checks

Fixed a permissions issue where organization admins were being required to have full admin permissions instead of just app admin permissions to access applications. Organization users with admin:app permissions can now properly access all apps within their organization without requiring broader admin privileges.

JUL 30 IMPROVEMENT - Improved error handling when accessing unauthorized or non-existent Spaces

Added a user-friendly error state with visual feedback when users attempt to access a Space they don’t have permission for or that doesn’t exist. The new UI shows a frown face icon and clear error message explaining the access issue, replacing the previous basic loading state. Also added loading skeletons for conversations and better error handling throughout the Space access flow.

JUL 30 IMPROVEMENT - Added guidance when no users are available to add to a Space

Added a helpful explanation and step-by-step instructions when attempting to add members to a Space with no available users. The new interface explains that users must first be invited to the organization before they can be added to a Space, and provides clear instructions for both inviting organization members and adding them to Spaces. Also improved the spacing of the permissions selection badge.

JUL 29 MODEL - Deprecation scheduled for Fireworks' Qwen2-72B Instruct model

The Fireworks AI Qwen2-72B Instruct model (fireworks/qwen2-72b-instruct) will be deprecated on August 12, 2024. Users should plan to transition to alternative models before this date.

JUL 29 IMPROVEMENT - Improved error message for model context window limitations

Added clearer error messages when a requested model’s context window is too small for the input. Instead of a generic ‘not allowed’ message, users now receive a specific message indicating that their request is too large for the model’s context window, helping them better understand and resolve the issue.

JUL 29 IMPROVEMENT - Improved organization name handling with better defaults

Organizations now get better default names when created: empty organization names are automatically set to ‘org-’, and empty display names default to ‘My Organization’. This improves the initial setup experience and ensures organizations always have meaningful identifiers. The changes apply to both new organizations and retroactively updates existing organizations with empty names.

JUL 29 IMPROVEMENT - Simplified organization creation process by removing Hubspot synchronization

Streamlined the organization creation workflow by removing automatic Hubspot synchronization that previously ran in the background. Organizations are now created faster with the same core functionality including Stripe customer creation and trial balance setup. The change maintains existing features like currency matching and free trial balance allocation.

JUL 29 IMPROVEMENT - Automatically create first workspace when setting up a personal organization

When users create a new personal organization, the system now automatically creates their first workspace named ‘My First Space’. This improvement streamlines the onboarding experience by eliminating the need for users to manually create their first workspace after organization setup.

JUL 29 IMPROVEMENT - Added default user avatar icon for messages without Auth0 ID

Enhanced the chat interface by adding a new default user avatar icon that appears when messages don’t have an associated Auth0 ID. The placeholder now shows a person icon in a white circular border instead of the previous empty sand-colored background, improving visual consistency and user recognition in chat conversations.

JUL 29 IMPROVEMENT - Automatically update URL when starting new conversations

Improved chat navigation by automatically synchronizing the URL with the conversation ID when sending the first message in a chat. When users start a new conversation, the browser URL will now update to reflect the correct conversation ID, making it easier to share or bookmark specific chat sessions.

JUL 27 IMPROVEMENT - Increased token estimation buffer from 1.5% to 15%

Increased the token estimation safety margin from 1.5% to 15% when checking if prompts fit within a model’s context window. This helps prevent failed requests by being more conservative in estimating token counts, especially for longer prompts and complex content.

JUL 27 FIX - Show all workspace spaces by default instead of only accessible ones

Changed the default workspace spaces view to show all available spaces instead of only showing accessible spaces. This fixes an issue where the first space wouldn’t be created properly due to filtering restrictions. Users will now see all workspace spaces by default when accessing the spaces page.

JUL 26 FIX - Fixed app visibility filtering and permissions handling in shared organizations

Fixed an issue with app visibility filters where users couldn’t properly see accessible apps in shared organizations. Organization admins now correctly see all apps, while other users only see apps they have explicit permissions for. Also improved permission inheritance logic where org-level admin permissions now properly cascade to all apps within the organization.

JUL 26 NEW - Added deep linking support for conversation threads

Added the ability to link directly to specific conversation threads through parent_request_id. The update includes database changes to track conversation relationships and auth0_id in requests, enabling better thread navigation and history tracking. Conversations now also automatically include system instructions from app settings when starting a new thread.

JUL 26 IMPROVEMENT - Added space accessibility indicator and filter

Added visual indicators and filtering options to show which spaces are accessible to users. Spaces without access permissions are now greyed out and non-clickable, while accessible spaces remain interactive. Users can filter spaces using a new ‘Accessible’ toggle button alongside the existing ‘Mine’ filter. This improves visibility of space permissions and helps users quickly find spaces they can access.

JUL 25 MODEL - Added new Fireworks models and deprecated legacy models

Added 5 new Fireworks AI models: Llama 3 70B (8K context), Mixtral 8x22B (65K context), Qwen2 72B (32K context), and Llama 3.1 70B/405B (128K context). All new models support streaming, JSON mode, and function calling. Several legacy models have been deprecated: Mixtral 8x7B, Command-R, Claude 3 Sonnet, Gemini Pro, GPT-3.5 Turbo, and Mistral Medium/Small.

JUL 24 MODEL - Added Mistral's latest models including Mistral Large 2.0 and Mixtral models

Added four new Mistral AI models: Mistral Large 2.0 (128K context), Mistral Large Latest (128K context), Mixtral 8x7B Instruct (32K context), and Mixtral 8x22B Instruct (65K context). All models support streaming and JSON output, with Mistral Large and 8x22B also supporting function calling. Pricing varies from 0.7µ¢ to 9µ¢ per token, with separate prompt and completion costs.

JUL 24 FIX - Fixed filename display truncation in message sources dialog

Fixed an issue where filenames were being truncated in the message sources dialog header. Long filenames will now display in full, improving readability when viewing source documents and files.

JUL 23 MODEL - Switch to gpt-4o-mini for local development environment

Changed the default model for local development from llama-3-70b-instruct to gpt-4o-mini, with Claude-3-haiku as a secondary option. Added latency metrics for both models with gpt-4o-mini averaging 1.1ms per token (p50) and Claude-3-haiku at 0.1ms per token (p50). This change optimizes local development performance and model selection.

JUL 23 MODEL - Add Llama 3.1 models (70B and 405B) with 128K context window

Added Meta’s latest Llama 3.1 models via Together and OctoAI providers: llama-3.1-70b-instruct and llama-3.1-405b-instruct. Both models feature a 128K token context window, support for 8 languages (English, French, German, Hindi, Italian, Portuguese, Spanish, Thai), and function calling capabilities. The models include improvements in understanding and instruction following compared to Llama 3.0, with the 405B variant being positioned as the most powerful open-source LLM for complex use cases.

JUL 19 MODEL - Promoted GPT-4o-mini as default model, deprecating GPT-3.5 Turbo and Cohere Command-R

GPT-4o-mini is now promoted as the default active model with a score of 1254, replacing GPT-3.5 Turbo and Cohere Command-R as default options. All existing model settings using GPT-3.5 Turbo have been automatically migrated to GPT-4o-mini, and Cohere Command-R settings have been removed.

JUL 18 MODEL - Added GPT-4o mini: OpenAI's new multimodal compact model

Added support for GPT-4o mini (OpenAI’s most advanced small model) with 128K context window. This new multimodal model accepts both text and image inputs, offering higher intelligence than GPT-3.5-turbo at similar speed. Features include function calling, streaming, JSON mode, and vision capabilities, with extremely cost-effective pricing at

0.00015/1K prompt tokens and

0.0006/1K completion tokens.

JUL 18 MODEL - Updated router model scores to pulze-wildbench-v0.1-20240718

Updated the router’s model scoring system to use the latest pulze-wildbench benchmark version dated July 18, 2024 (previously July 10, 2024). This update applies to both development and production environments and may affect how the router selects and ranks AI models for requests based on updated performance benchmarks.

JUL 17 NEW - Added file labeling system for custom data files

Introduced a new file labeling system allowing users to create, manage, and assign colored labels to custom data files. Labels can be created with names (up to 32 characters), descriptions, and custom colors (hex format). Users can bulk update file labels and filter/search files by labels through the new API endpoints.

JUL 16 MODEL - Add Gemma-2-27B-IT model and update Qwen2-72B model name

Added Google’s Gemma-2-27B-IT model with 8,192 token context window, available through Together AI. This lightweight model supports streaming, penalties, and multiple outputs (n>1). Also updated Qwen2-72B-Chat to be correctly named as Qwen2-72B-Instruct. Token costs are set at $0.0000008 per token for both prompt and completion.

JUL 13 NEW - Enable fine-grained app-level RBAC permissions by default

Fine-grained Role-Based Access Control (RBAC) for applications is now enabled by default for all environments. This enhances security by requiring explicit viewer, editor, or admin permissions at both the organization and app levels. Users must have at least viewer:app permissions at the organization level to perform any app-related actions.

JUL 12 IMPROVEMENT - Enhanced model failover chain behavior and model whitelisting

Improved the model failover chain system with smarter handling of model activation/deactivation. When disabling models, the failover chain is now automatically reset only when necessary, maintaining chain configurations when possible. The system also better handles model candidate selection for apps, with more efficient model whitelisting and validation checks.

JUL 11 IMPROVEMENT - Enhanced @-references panel with better loading states and empty state messages

Improved the @-references panel with clearer loading states, error handling, and helpful placeholder text. Users now see a ‘Start typing to show more relevant suggestions’ prompt in the search box, and get clear feedback when no results are found. The panel also maintains previous search results while loading new ones for a smoother experience.

JUL 11 FIX - Fix: Prevent sending empty messages in chat

Fixed an issue where empty messages could be sent by pressing Enter when the editor contained only whitespace. The editor now properly checks for empty content by examining all nodes and their text content, preventing submission of blank messages.

JUL 10 MODEL - Added Cohere Command-R model with default active status

Added Cohere’s Command-R model to the available model roster and set it as default active. Based on the model scores fixture, Command-R is positioned with a 0.8 routing score alongside Claude 3 Haiku (0.9) and Llama 3 70B Instruct (1.0), making it available for automatic model routing.

JUL 10 IMPROVEMENT - Enhanced RAG query handling with better search results organization

Improved the handling of RAG (Retrieval Augmented Generation) queries by collecting all search results before processing and introducing a new sorting mechanism. The system now sorts search results by relevance score and includes a new RAG query rewrite plugin that formats retrieved content with proper citations and timestamps. This change enables more comprehensive document retrieval and better organized responses when querying multiple documents.

JUL 10 IMPROVEMENT - Model responses now grouped by prompt in comparison view

Improved the model comparison interface to only show responses from the same prompt when switching between models. The ModelSwitcher now filters responses based on matching prompts, and comparison views are restricted to responses generated from identical prompts. Added a readOnly state for pinned messages to prevent model switching.

JUL 10 IMPROVEMENT - Enhanced spaces menu with loading states and search improvements

JUL 10 FIX - Fixed model comparison selection logic

Fixed an issue where model comparisons weren’t correctly selecting the next response for comparison. The logic now compares responses based on their unique IDs instead of model names, ensuring more accurate response comparisons across different chat messages.

JUL 09 MODEL - Major model updates: New models added and legacy versions deprecated

Added 2 new models: AI21’s Jamba Instruct (256K context) and OctoAI’s WizardLM-2-8x22B (65K context). Deprecated numerous legacy models across providers including AI21 Labs (J2 series), Anthropic (Claude 2.x, Claude Instant), OpenAI (older GPT-4 and GPT-3.5 versions), Google (Gemini 1.0, Bison), and Together AI’s previous model versions. New default active models include Claude 3 Sonnet/Haiku, Gemini 1.5 Pro/Flash, and various LLaMA-3 70B implementations.

JUL 09 FIX - Fixed RAG query handling and improved prompt templates

Fixed retrieval-augmented generation (RAG) by sanitizing search queries to remove special tokens and improving response accuracy. Also streamlined system prompts by removing redundant language detection instruction and simplifying the template structure. These changes improve the reliability and accuracy of responses when using both file and web search capabilities.

JUL 09 NEW - Added ability to delete conversation threads

Users can now delete conversation threads from their applications. This implements a soft-delete functionality where threads are marked as deleted but preserved in the database. Deleted threads are automatically filtered out from conversation queries and can’t be accessed after deletion.

JUL 09 FIX - Fix plugins not persisting when re-running prompts

Fixed an issue where file search and web search plugin settings weren’t being preserved when re-running previous prompts. The system now correctly maintains the original plugin configuration (file search and web search) when users click the re-run button on existing messages.

JUL 09 IMPROVEMENT - Enhanced error handling and display in chat interface

Improved error message display with a new collapsible error component that shows detailed error information. Users can now click to expand/collapse error details, and the system provides more specific error messages instead of generic failures. Also improved handling of stream response errors and authentication token expiration.

JUL 09 IMPROVEMENT - Added retry logic when fetching chat log entries

Improved chat reliability by adding automatic retry functionality when fetching log entries. The system now makes up to 5 retry attempts with 500ms delays between attempts if the initial fetch fails, reducing interruptions in chat conversations when experiencing temporary network issues.

JUL 09 IMPROVEMENT - Added configurable auto-focus behavior for search inputs

Search inputs now support an optional autoFocus property to control automatic focusing behavior. By default, search inputs will automatically focus and scroll into view, but this can now be disabled by setting autoFocus=false. This improves user experience in scenarios like the Spaces Menu where automatic focus is undesirable. Additionally, the mention panel now supports Tab key for selection and displays loading/empty states more clearly.

JUL 09 IMPROVEMENT - Always show prompt controls regardless of error state

Modified chat interface to consistently display prompt controls even when there are response errors. Previously, prompt controls were hidden when an error occurred, limiting user interaction options. This change improves the user experience by maintaining access to prompt actions regardless of the response status.

JUL 09 FIX - Fix plugin settings persistence when comparing model responses

Fixed an issue where file search and web search plugin settings weren’t preserved when comparing responses between different models. The system now correctly maintains the original message’s plugin configuration (file search and web search settings) when generating comparison responses.

JUL 09 NEW - Added per-message plugin selection for web search and file search

Users can now enable web search and file search plugins individually for each message via the @ menu. Each plugin appears as a removable chip above the message editor, allowing users to toggle them on/off per conversation turn. The plugins’ state is preserved and initialized based on the message context.

JUL 09 IMPROVEMENT - Integrated file and web search options into filtering system

Improved the mention panel interface by integrating file search and web search capabilities directly into the filtering system. Previously, these options were handled separately through showAllFiles and showWebSearch flags. The update streamlines the search experience by consolidating all search types (data, file, and web) into a unified filtering interface with better keyboard navigation and selection handling.

JUL 07 IMPROVEMENT - Enhanced language matching instructions for query responses

Improved prompt template instructions for language matching by making the language detection requirement more explicit. The system now specifically instructs to identify the primary language of the query before responding, ensuring more consistent same-language responses across all interactions.

JUL 05 IMPROVEMENT - RAG responses now match query language for better multilingual support

Improved RAG (Retrieval Augmented Generation) functionality to ensure responses are always provided in the same language as the user’s query. This enhancement enables more natural multilingual interactions by enforcing language consistency between questions and answers in RAG-based conversations.

JUL 05 IMPROVEMENT - Improved citation formatting and prompt engineering for better responses

Updated citation format to appear after sentences rather than immediately after words, making responses more readable. Added virtual whiteboard instruction to help models structure their thoughts, and simplified source template formatting. Citation examples now consistently show citations after complete sentences with proper spacing.

JUL 05 IMPROVEMENT - Enhanced prompt engineering for better citation formatting and readability

Improved the citation system with better formatting and clearer instructions for AI responses. Citations are now displayed with proper markdown formatting, multi-line text is properly quoted, and citation numbers are ordered more logically. The prompt instructions have been refined to ensure more consistent and accurate source citations with explicit examples and stricter formatting rules.

JUL 05 IMPROVEMENT - Updated hover and active states in mention selector UI

Updated the visual feedback when hovering over and selecting mention candidates in the editor interface. Changed the background color from the previous row-hover style to a sand-200 color variant, providing more consistent visual feedback across the application.

JUL 04 IMPROVEMENT - Added keyboard navigation support for mention suggestions

JUL 03 IMPROVEMENT - Added timestamps to RAG query responses

Augmented RAG (Retrieval-Augmented Generation) responses with UTC timestamps to provide temporal context for each query. This helps users understand when information was retrieved and processed, especially important for time-sensitive queries or when referencing dynamic content.

JUL 03 IMPROVEMENT - Simplified file and URL pattern syntax in prompts

Simplified the syntax for file and URL references in prompts by removing the ’@’ prefix from pattern matching. File and URL references now use the format ‘<!file:uuid:name>’ or ‘<!url:uuid:name>’ instead of ‘@<!file:uuid:name>’ or ‘@<!url:uuid:name>’.

JUL 03 IMPROVEMENT - Model reference list now shows only active models for each application

Improved the model reference list to show only actively configured models for each application, removing inactive or unconfigured models from the results. This change simplifies the model selection experience by displaying only relevant, non-deprecated models that are specifically configured for your application.

JUL 03 NEW - Added RAG improvements and @-reference plugin support

Added support for @-reference plugins and improved RAG (Retrieval Augmented Generation) functionality through a new plugin system. Users can now list references for apps via a new /references endpoint that returns both custom data and model references. The RAG system has been restructured to use separate file search and web search services, providing more flexible document retrieval capabilities.

June 2024

JUN 27 IMPROVEMENT - Improved streaming response handling for playground requests

Enhanced the playground API to automatically handle streaming responses based on model capabilities. The system now always attempts to use streaming mode first, and gracefully falls back to non-streaming when a model doesn’t support it, providing a more consistent experience across different providers. This eliminates unnecessary streaming-related errors and removes provider-specific streaming restrictions.

JUN 26 IMPROVEMENT - Made date separators sticky in chat view

Enhanced chat message readability by making date separator headers sticky at the top while scrolling through messages. Date markers showing ‘MM/DD/YYYY’ now remain visible at the top of the viewport, making it easier to maintain context when viewing long chat histories.

JUN 24 IMPROVEMENT - Updated trial period settings and initial balance for new organizations

Modified trial settings to provide

50 USD in free credits (up from

20) for new organizations. Production trial period extended to 90 days (up from 21), while test environment trials reduced to 3 days. Trial credits are now only automatically added in local development environments.

JUN 24 IMPROVEMENT - Improved HTML parsing and expanded initial search results for better reranking

Enhanced the document retrieval system by increasing the initial search results from 5x the requested amount to 200 documents before reranking, improving the quality of final results. Updated HTML and PDF parsing to preserve structured content (tables, lists) by extracting both plain text and HTML metadata, ensuring richer context for search queries. Also added a proper user agent identifier (PulzeBot/1.0) for web crawling operations.

JUN 21 IMPROVEMENT - Added email addresses to app members list and sorted by name

Users can now see email addresses for all app members and member candidates in the app management interface. The members and candidates lists are now automatically sorted alphabetically by name for better organization. This update improves user identification and list navigation when managing app permissions.

JUN 21 IMPROVEMENT - Enhanced app member management with additional user information and candidate listing

Improved the app member management interface by adding detailed user information (name, profile picture) to member listings and a new endpoint to view potential members who can be added to an app. Also added member preview functionality to show up to 5 member profile pictures in app listings and added permissions visibility for the current user in app views.

JUN 21 IMPROVEMENT - Improved code block rendering in markdown with better language detection

Enhanced markdown code block rendering to better handle syntax highlighting for different programming languages. Now explicitly supports Python, JavaScript, Bash, JSON, and plain text, with improved fallback to plain text for unsupported languages. Added overflow scrolling for long code blocks and removed keyboard-style formatting for non-language code segments.

JUN 20 MODEL - Added Claude 3.5 Sonnet model with 200K context window

Added support for Anthropic’s Claude 3.5 Sonnet model (claude-3-5-sonnet-20240620) with a 200,000 token context window. The model supports streaming responses and has token costs of

0.000003 per prompt token and

0.000015 per completion token. Two variants are available: the base model name ‘claude-3-5-sonnet’ and the specific version ‘claude-3-5-sonnet-20240620’.

JUN 20 IMPROVEMENT - Added assistant settings synchronization with temperature and token controls

Added support for synchronizing assistant settings with the backend, including customizable instructions, maximum token limits (up to 2000 tokens), and temperature controls (0-1 range, default 0.7). These settings can now be updated and persisted alongside existing weights and policies configurations.

JUN 20 IMPROVEMENT - Enhanced RAG system prompting with more precise instructions and journalistic tone

Updated the RAG (Retrieval-Augmented Generation) system prompting to enforce a more journalistic tone and stricter source adherence. The system now explicitly requires responses to be derived solely from provided sources, with clearer instructions on citation formatting and a stronger emphasis on avoiding speculation or external knowledge.

JUN 20 IMPROVEMENT - Updated empty conversations view with new link-style button

Changed the ‘Start a new thread’ button in the empty conversations view from a standard button to an underlined, purple text link. This UI update provides a more subtle and modern look while maintaining the same functionality to create new chat threads.

JUN 19 IMPROVEMENT - Enhanced mobile-friendly design with improved navigation and user menu

JUN 19 FIX - Fixed spacing in space selection notification message

Corrected the spacing in the notification toast that appears when selecting a space, removing an extra space before the exclamation mark for better text aesthetics.

JUN 19 IMPROVEMENT - Increased parallelism for S3 chunk fetching to improve retrieval performance

Improved RAG document retrieval performance by increasing the maximum concurrent connections for fetching chunks from S3 storage from the default (10) to 60. This enhancement allows the system to download multiple document chunks in parallel more efficiently, reducing latency when retrieving search results. The concurrency level is now configurable via the PULZE_RAG_DATA_DOWNLOAD_MAX_CONCURRENCY environment variable.

JUN 18 NEW - Added search and pagination for apps and conversations

Added search functionality and cursor-based pagination for both apps and conversations lists. Users can now search apps by name and filter to show only owned apps. Conversations are now automatically sorted by last modified time, with a new modified_on column tracking the most recent message in each conversation.

JUN 18 IMPROVEMENT - Show failed processing state for data files and disable interactions with unindexed files

Added visual feedback for failed data file processing, showing a red ‘Processing Failed’ message. Files that are not in ‘INDEXED’ state can no longer be toggled active/inactive, preventing interaction with files that are still processing or have failed. The file activation switch is now only enabled for successfully indexed files.

JUN 18 IMPROVEMENT - Added name input dialog when creating new spaces

Users can now customize the name of their spaces when creating them through a new dialog interface. Previously, spaces were created with default names; now users can enter a custom name before creation. The system automatically creates a space called ‘My First Space’ for new users’ first workspace.

JUN 17 IMPROVEMENT - Added logout option when authentication loading persists

Added a new UI element that provides users with a logout option when experiencing persistent authentication issues. A clickable ‘here’ link now appears at the bottom-right of the loading screen, allowing users to manually logout if authentication gets stuck. Also improved error handling for invite code parsing during the organization joining process.

JUN 17 IMPROVEMENT - Improved organization invite flow with consistent page reload behavior

Enhanced the organization invitation flow by adding success messages for both accepting and declining invites. Now displays ‘Organization joined!’ when accepting an invite and ‘Organization invite declined!’ when declining, followed by an automatic redirect to the dashboard page.

JUN 17 FIX - Fixed model switching behavior and expired token handling

Fixed two issues: 1) Model switching now correctly displays all available models and prevents duplicate entries in the model selection dropdown. 2) Added automatic page reload when authentication tokens expire, preventing session-related errors. The model switcher’s ranking system now only excludes the top 3 ranked models from the full list, making more models accessible.

JUN 17 IMPROVEMENT - Enhanced organization switcher with placeholder logos and improved image handling

Improved the organization switcher UI by adding placeholder images with organization initials when no logo is available. The ImagePreview component now supports dynamic placeholders using organization names, displays logos in a consistent size (16x16), and maintains aspect ratio with proper background containment. Organization logos in the switcher now have a unified styling with slate-colored rings and proper scaling.

JUN 15 IMPROVEMENT - Enhanced response handling when sources don't match query

Added standardized response behavior when source documents are not relevant to the user’s query. The system will now respond with ‘Sorry, I do not have enough information to respond to this query’ instead of attempting to generate a potentially inaccurate response from irrelevant sources.

JUN 15 FIX - Fixed data loss when saving sandbox app changes

Fixed a bug where custom data would be lost when merging sandbox app changes back to the parent app. The code previously attempted to replace custom data during sandbox merging, which could lead to data loss. Now maintains existing custom data integrity during sandbox operations.

JUN 14 IMPROVEMENT - Improved chat interface with centered layout and better message styling

Enhanced the chat interface with a centered layout (max-width 3xl) and improved message styling. Messages now have a cleaner look with a white background, rounded borders, and better spacing. User prompts and AI responses are more visually distinct with consistent padding and improved avatar alignment.

JUN 14 IMPROVEMENT - Enhanced chat interface with full-width layout and refined UI elements

Redesigned the chat interface with a full-width layout and improved message presentation. Messages now span the full width with cleaner spacing, refined message bubbles, and improved visual hierarchy. Additional UI improvements include updated placeholder text from ‘conversation’ to ‘thread’, truncated conversation titles in the header, and a more compact header design.

JUN 14 IMPROVEMENT - Added popover menu for quick space navigation

JUN 13 IMPROVEMENT - Redesigned organization switcher with improved UI and visual feedback

Added a new organization switcher component with enhanced visual design, including organization logos, active state indicators, and smoother transitions. Users can now easily switch between organizations with clear visual feedback, see their current active organization highlighted in purple, and access organization management options directly from the switcher menu.

JUN 13 IMPROVEMENT - Added user agent header to URL ingestion requests

URL ingestion now includes a browser user agent (Chrome 58 on Windows) when fetching web content. This improves compatibility with websites that block or restrict requests without proper user agent headers, reducing failed ingestion attempts for RAG document processing.

JUN 12 NEW - Added Space Settings management and improved navigation

JUN 11 MODEL - Add Qwen2-72B and rename Mistral-7B model versions

Added Qwen2-72B-Chat model (131K context window) with support for streaming, penalties, and multiple outputs. Model costs 0.0009¢ per token for both prompt and completion. Additionally, renamed together/mistral-7b-instruct-v0.3 to together/mistral-7b-instruct, and together/mistral-7b-instruct to together/mistral-7b-instruct-v0.2 for better version clarity.

JUN 11 FIX - Fixed organization switching and authentication redirect URLs

Fixed incorrect redirect paths when switching organizations or authenticating by updating URLs from ‘/spaces’ to ‘/s’. This ensures users are properly redirected to the correct spaces dashboard URL after organization switches and authentication events.

JUN 11 NEW - Added organization management features with new UI components

Added new organization management functionality with user interface components including member removal and refresh capabilities. Introduced new icons (person-remove.svg and refresh.svg) and placeholders for organization imagery, along with badge components for improved user management interface. The changes include Zod validation library integration (v3.23.8) for enhanced data validation.

JUN 10 NEW - Added granular App permissions and membership management

Introduced a new app-level permissions system with specific roles (viewer, editor, admin) and membership management. Users who create apps are now automatically assigned as app admins, and permissions are enforced at both organization and app levels. This enables more fine-grained access control for app operations like key regeneration, custom data management, and model configuration.

JUN 10 IMPROVEMENT - Opened public access to Spaces by removing waitlist requirement

Removed the waitlist requirement for signing up to Spaces in production, allowing immediate access for all new users. Also updated authentication infrastructure to use auth.pulze.ai domain for enhanced security and branding consistency.

JUN 09 NEW - Add system instructions support for Google Chat models (Gemini 1.0 Pro and newer)

Added support for system instructions in Google Chat models, allowing users to set context and behavior instructions at the conversation level. System instructions are now properly handled for all Gemini models except gemini-1.0-pro-001, where the system message is automatically prepended to the first user message as a workaround. This brings Google Chat models in line with other providers’ system instruction capabilities.

JUN 07 NEW - Added source references for AI responses with document previews

AI responses now display source documents that were used to generate the answer. Sources appear below each response with document paths and file previews in a grid layout. Users can see which files were referenced, with each source showing a file icon and truncated path name.

JUN 06 NEW - Removed flowz feature and related functionality

JUN 06 IMPROVEMENT - Enhanced payment method listing to support all payment types

Expanded payment method listing functionality to retrieve all available payment types for an organization, removing the previous restriction that limited results to card payments only. This change allows organizations to view and manage a wider range of payment methods in their billing settings.

JUN 06 NEW - Added split-screen message comparison view in chat interface

Added ability to compare different model responses side-by-side in a split-screen view. Users can now select between multiple model responses for the same prompt, visually compare them in a two-column layout, and easily switch between different model versions using a new model switcher component. The feature includes visual indicators for selected messages and the ability to trigger new comparisons directly from the chat interface.

JUN 06 IMPROVEMENT - Updated branding to Pulze Spaces with new favicon

Rebranded application to ‘Pulze Spaces’ with a new orange circular favicon replacing the default Vite logo. Also improved UI alignment in the Spaces view by adding consistent vertical centering for avatars, names, and action buttons.

JUN 06 IMPROVEMENT - UI improvements to sidebar navigation and chat interface

JUN 06 IMPROVEMENT - Renamed 'Retrieved Files' to 'Sources' and fixed button display issue

The ‘Retrieved Files’ button and panel have been renamed to ‘Sources’ throughout the sandbox interface for better clarity. Additionally, fixed a bug where the Sources button would appear even when no documents were retrieved (RAG not triggered), now correctly hiding when the documents array is empty.

JUN 05 FIX - Improved model score sorting for more consistent results

Enhanced the model scoring system to use more comprehensive sorting criteria. When multiple models have the same overall score, they are now consistently ordered by additional factors: Quality, Latency, and Cost (in that priority). This provides more stable and predictable model rankings in the API response.

JUN 05 IMPROVEMENT - Renamed 'Notebooks' to 'Spaces' and improved dialog UI

Refactored the notebook renaming functionality into a more general ‘Space’ concept, with improved dialog UI components. The rename dialog now uses a standardized Dialog component and includes better focus management for the input field. This change represents a terminology shift from ‘notebooks’ to ‘spaces’ throughout the application.

JUN 05 IMPROVEMENT - Renamed Notebooks to Spaces with improved UI navigation

JUN 04 FIX - Increased default max tokens from 200 to 2000

Fixed an overly restrictive default token limit that was set to 200 tokens. The default maximum token limit has been increased to 2000 tokens, allowing for longer message responses in chat conversations. This change also includes refactoring of chat history handling to improve message organization.

JUN 04 FIX - Added error feedback for chat messages and improved file upload interaction

Added error toast notifications when chat messages fail to send, providing users with immediate feedback about communication issues. Also improved the file upload area by making it clickable, allowing users to trigger the file selector by clicking anywhere in the drop zone in addition to drag-and-drop functionality.

JUN 04 IMPROVEMENT - Improved RAG document chunking with 2-phase splitting mechanism

Enhanced document retrieval accuracy by implementing a two-phase chunking strategy. Documents are now first split into larger context windows (768 tokens with 128 token overlap), then subdivided into smaller embedding chunks (128 tokens with 32 token overlap). This replaces the previous single-phase approach that used arbitrary merging, resulting in better semantic understanding while maintaining context during retrieval operations.

JUN 04 IMPROVEMENT - Enhanced reliability with automatic retry logic for Qdrant and Tika failures

Added automatic retry mechanisms for Qdrant vector store operations (add, delete, retrieve) and Tika PDF text extraction service. Operations now retry with exponential backoff (up to 15 seconds between attempts) for up to 60 seconds when encountering connection or response handling failures. Tika service calls now properly validate response status codes and raise errors on failures (status >= 400), triggering the existing retry logic.

JUN 04 FIX - Fixed PDF parsing to correctly split text into nodes for RAG indexing

Fixed an issue where PDF content extracted from Tika was not being properly parsed into structured nodes before indexing. The fix introduces HTML-based chunking using partition_html with ‘by_title’ strategy, replaces default sub-sentence splitting functions to avoid regex-based splitting that was causing issues, and adds validation to ensure start/end character indices are present for accurate byte range calculations. This resolves problems with PDF text extraction and improves the reliability of document retrieval from S3.

JUN 03 IMPROVEMENT - Enhanced chat UI with improved model switcher and message actions

Improved the chat interface with more polished UI elements. Message actions (copy and regenerate) are now displayed as icon buttons with tooltips, the model switcher has an improved visual design with a white background, and search inputs across the application now support custom placeholders. Tooltips throughout the application now feature rounded corners for a more modern look.

JUN 03 IMPROVEMENT - Improved chat history grouping to use rolling 7-day window

Updated conversation history grouping to use a rolling 7-day window instead of the calendar week. Conversations are now grouped into ‘Today’, ‘Yesterday’, and ‘Last 7 Days’ categories, providing a more intuitive timeline view of chat history.

JUN 03 TECHNICAL - Add URL validation and file handling dependencies

Added new dependencies for URL validation (url-regex-safe) and added new file and link-related icons (FileBlankIcon, LinkIcon) to the icon library. This change appears to be foundational work for upcoming features related to file handling and URL processing in notebooks.

JUN 03 IMPROVEMENT - Chat conversations now automatically titled using first words of the prompt

Chat conversations are now automatically named using the first few words of the initial prompt (up to 50 characters), instead of being titled ‘Untitled Chat’. This makes it easier to identify and locate specific conversations in your chat history. The title is intelligently truncated to avoid breaking words in the middle.

JUN 01 NEW - Add support for source citations in model responses via feature flag

Added ability to enable source citations in RAG (Retrieval-Augmented Generation) responses through a new ‘citations’ feature flag. When enabled, model responses will automatically include citation references (e.g., [[citation:1]]) at the end of each sentence to indicate which source documents were used. Citations are formatted consistently and include both single and multiple source references.

May 2024

MAY 31 IMPROVEMENT - Enhanced RAG with improved document retrieval and citation formatting

Improved RAG (Retrieval Augmented Generation) system with a new document citation format that provides clearer source attribution. Documents are now tagged with sequential citation numbers (e.g., [[citation:1]]), and responses include specific citations for each statement. The system also includes improved prompt instructions for more natural, concise responses while maintaining accuracy and comprehensive source references.

MAY 31 NEW - Added persistent conversations with chat history support

Introduced a new conversations feature that allows users to maintain persistent chat histories within apps. Users can now create named conversations, view conversation history, and link related requests together using parent-child relationships. The update includes new API endpoints for creating, updating, deleting, and retrieving conversations, plus UI support for managing conversation threads.

MAY 31 IMPROVEMENT - Renamed 'conversation' to 'chat' throughout the user interface

Updated terminology in the UI from ‘conversation’ to ‘chat’ for better consistency and clarity. Changes include renaming the ‘Recent Conversations’ header to ‘Recent Chats’ and updating the dialog title from ‘Rename conversation’ to ‘Rename chat’.

MAY 31 IMPROVEMENT - Renamed 'Conversations' to 'Chat History' in UI

Updated navigation labels and section titles to use ‘Chat History’ instead of ‘Conversations’ for better clarity. Also standardized capitalization of ‘Customize Assistant’ in the interface. These changes make the UI terminology more intuitive and consistent.

MAY 31 IMPROVEMENT - Increased default max tokens from 200 to 2000 in chat interface

Increased the default maximum token limit for chat responses from 200 to 2000 tokens, allowing for significantly longer model responses without manual adjustment. This change provides more comprehensive responses while maintaining the same temperature setting of 0.7.

MAY 31 IMPROVEMENT - Improved model switcher visibility in chat interface

Fixed a z-index issue with the model switcher dropdown menu that could be hidden behind other UI elements. The model selection popup now consistently appears above other page content for better usability.

MAY 28 MODEL - Added Gemini 1.0/1.5 models and updated pricing

Added new Google models including Gemini-1.0-Pro-001 (32K context), Gemini-1.5-Pro-001 (1M context), and Gemini-1.5-Flash-001 (1M context). Deprecating Gemini-1.5-Flash-Preview and Gemini-1.5-Pro-Preview (June 24, 2024) and setting end dates for Gemini-1.5-Flash-001 and Gemini-1.5-Pro-001 (May 24, 2025). Updated pricing for Gemini Pro models and added aliases ‘gemini-1.5-pro’ and ‘gemini-1.5-flash’ for latest stable versions.

MAY 25 MODEL - Add Mistral 7B Instruct v0.3 model via Together AI

Added Mistral-7B-Instruct-v0.3 model through Together AI provider with a 32K context window. The model supports chat, streaming, penalties, and multiple outputs (n), with competitive pricing at $0.0002 per thousand tokens. This instruct-tuned version of Mistral-7B-v0.3 is initially set as inactive by default.

MAY 24 IMPROVEMENT - Enhanced RAG response quality and consistency with improved prompt instructions

Improved RAG (Retrieval Augmented Generation) response quality by refining the prompt structure and instruction clarity. The system now provides more natural, concise answers without referencing the context explicitly, and includes clearer guidelines for the AI to process information step-by-step while maintaining response accuracy.

MAY 24 IMPROVEMENT - Improved chat UI with better scrolling behavior and user name display

Enhanced the chat interface with smarter auto-scrolling that responds to scroll direction and preserves user scroll position when reading history. Improved user name display in the sidebar to handle long names with truncation. Added edit icon assets and renamed notebook dialog functionality for better UX.

MAY 23 FIX - Fixed terms/privacy acceptance tracking and test model visibility in debug mode

Added support for tracking when users accept terms and privacy policies by storing acceptance timestamps from Auth0. Also improved test model visibility logic to show test models only when debug mode is enabled, rather than based on environment. These changes ensure better user consent tracking and clearer test model handling.

MAY 23 NEW - Added notebook renaming capability and toast notifications

Added ability to rename notebooks through a new dialog interface, along with react-hot-toast notifications for user feedback. The dialog includes a text input field with auto-focus and integrated API updates to persist name changes.

MAY 22 IMPROVEMENT - Removed user signup restrictions in development environment

MAY 22 NEW - Added notebook renaming functionality

Users can now rename notebooks directly from the chat interface. A new rename option appears in the notebook header menu, opening a dialog where you can enter a new name. The change is immediately reflected across the interface and in the notebooks list.

MAY 22 IMPROVEMENT - Made notebooks page scrollable to view all content

The notebooks page now supports vertical scrolling, allowing users to view all notebook content even when it extends beyond the visible area. Previously, content that extended below the viewport was inaccessible, but now users can scroll through their entire notebooks list and content.

MAY 22 IMPROVEMENT - Improved sidebar navigation and loading experience

MAY 22 IMPROVEMENT - Simplified navigation by removing unused sidebar menu items and chat header buttons

MAY 22 NEW - Added toast notifications for notebook actions

Implemented toast notifications using react-hot-toast library to provide user feedback when creating or deleting notebooks. When a notebook is created, a success message displays “Using [notebook name]!” and when deleted, shows “[notebook name] removed!”. Notifications appear in the bottom-right corner with customized durations: 2 seconds for success messages and 5 seconds for errors.

MAY 21 IMPROVEMENT - Updated browser tab title from 'Vite + React + TS' to 'Frontend v2'

The browser tab title has been updated to display ‘Frontend v2’ instead of the default ‘Vite + React + TS’ template text. This provides better branding and makes it easier to identify the application when multiple tabs are open in your browser.

MAY 21 FIX - Fixed border visibility for outlined buttons

Fixed an issue where outlined buttons would not consistently display their borders due to CSS specificity conflicts. The border styling now uses the !important flag to ensure outlined buttons always show their border when not selected, improving visual consistency across the interface.

MAY 21 IMPROVEMENT - Enhanced user menu with dropdown in sidebar

MAY 21 NEW - Added notebook deletion functionality with dropdown menu

MAY 21 NEW - Added functional 'New notebook' button to create and navigate to notebooks

The ‘New notebook’ button on the Notebooks page is now fully functional. Clicking the button creates a new notebook via the /apps/create API endpoint and automatically navigates you to the chat interface for that notebook. The button also includes enhanced visual states with hover, focus, and active styling for better user feedback.

MAY 21 IMPROVEMENT - Disabled manual resizing of chat input textarea

The chat input box can no longer be manually resized by dragging the corner. The textarea now maintains its automatic height adjustment based on content while preventing user-initiated resizing, creating a more consistent and streamlined chat interface.

MAY 21 IMPROVEMENT - Chat input box now automatically focuses on page load

The chat input box now automatically receives focus when you open the chat page, allowing you to start typing immediately without clicking into the text field first. Additionally, removed the placeholder message ‘How can I help you today?’ that previously displayed when no messages were present.

MAY 21 IMPROVEMENT - Improved visual alignment of chat message layout

Adjusted the spacing and padding in the chat message interface for better visual consistency. User avatars and message content now align more precisely, with profile pictures and AI icons positioned using 3px padding (reduced from 4px), and message text using 3px left padding instead of the previous 16px margin-left layout. The vertical spacing between user name and message content has also been increased from 2px to 4px for improved readability.

MAY 21 IMPROVEMENT - Chat textarea now auto-expands as you type longer messages

The chat input box now automatically grows in height as you type multi-line messages, up to a maximum of 50% of the viewport height. This makes it easier to compose and review longer prompts without manually resizing the text area or scrolling within a fixed-size input box.

MAY 21 FIX - Fixed Send button padding in chat interface

Adjusted the padding of the Send button in the chat box to improve visual alignment and spacing. The button now has consistent horizontal padding (px-3) for better appearance. Also disabled mock store functionality to ensure real data is used in the chat interface.

MAY 21 IMPROVEMENT - Updated model switcher icon and styling

Improved the visual design of the model switcher in chat by replacing the checkmark icon with a more prominent check-circle icon and changing the highlight color from blue to purple for better visibility of the currently selected model.

MAY 21 IMPROVEMENT - Added animated loading screen with Pulze logo

Replaced the placeholder “Page Loader” text with a proper loading screen component featuring an animated Pulze logo. The loader now displays a centered, pulsing Pulze logo (100px) while the application authenticates and initializes, providing better visual feedback during page load.

MAY 21 FIX - Fixed disabled button text color and improved authentication flow

Fixed an issue where disabled button text color could be overridden by other styles, ensuring disabled buttons always display the correct muted text appearance. Also improved the authentication experience by showing a ‘Redirecting to login’ message when users are not authenticated, and updated the model switcher label from ‘recommended for your prompt’ to ‘recommended for your message’ for clarity.

MAY 21 NEW - Added call-to-action button support in page headers

Page headers can now display action buttons on the right side. The Notebooks page now features a prominent purple “New notebook” button in the header, making it easier to create notebooks. The Button component also supports a new primary type with purple styling (#986BFF) and enhanced outlined button states with selected styling.

MAY 21 IMPROVEMENT - Added faded appearance option to sidebar collapse/expand buttons

The sidebar collapse and expand buttons now have a faded appearance with 60% opacity, making them visually less prominent. This enhancement to the Button component includes a new ‘faded’ prop that reduces the visual weight of buttons when needed. Additionally, navigation menu item spacing has been increased with more vertical padding for improved readability.

MAY 21 IMPROVEMENT - Enhanced chat interface with improved button styling and send functionality

Redesigned the chat input box with a dedicated ‘Send’ button that becomes active when text is entered, replacing the keyboard-only submit method. Updated button component styling with new ‘fill’ appearance (black background with white text), improved disabled state styling (gray background and text), and refined ghost button styles. The chat textarea now includes an enhanced placeholder text prompting users to ‘Ask a question, start a conversation, or type / for commands’.

MAY 21 IMPROVEMENT - Added collapsible sidebar navigation

MAY 20 MODEL - Added Gemini 1.5 models and updated Google model deprecation schedule

Added two new Google models: Gemini 1.5 Pro Preview (1M context) for complex reasoning tasks and Gemini 1.5 Flash Preview (1M context) for high-speed operations. Also scheduled deprecations for several Google models: Chat-bison and Text-bison@002 models (Oct 9, 2024), Text-unicorn@001 (Nov 30, 2024), and Gemini 1.5 Pro Preview 0409 (June 14, 2024). Both new models support streaming and chat functionality.

MAY 20 FIX - Fixed Chat button navigation issue in header

MAY 20 IMPROVEMENT - Updated sidebar collapse icon color for better visual consistency

The sidebar collapse icon now uses a muted ghost color (text-ghost-muted) instead of the default color. This improves visual hierarchy and consistency with the overall sidebar design, making the interface more polished and easier to scan.

MAY 20 IMPROVEMENT - Enhanced navigation menu styling and notebook navigation

MAY 20 IMPROVEMENT - Restructured application routing and added authentication layer

Overhauled the application’s routing system with new layout structure, moving from a single Layout component to a Root component with nested routing. Added comprehensive authentication context using Auth0 that automatically handles login redirects, token management, and API instance creation with authorization headers. Updated the application logo with a larger, redesigned version featuring a new color scheme (dark green background #003C00 with cream text #FFF5DC).

MAY 17 IMPROVEMENT - Display model snapshot dates with @ symbol in model names

Model names now display with their snapshot dates using the @ symbol format (e.g., gpt-4@2024-05-13). The ModelDisplay component has been enhanced to include an optional ‘at’ field that appends the snapshot date to the model name when available, providing clearer model version information throughout the interface.

MAY 17 IMPROVEMENT - Added search functionality to model switcher for easier model selection

The model switcher now includes a search box that allows you to filter models by typing. The search matches against model names and providers (e.g., ‘OpenAI gpt-4’ becomes searchable as ‘openaigpt4’). The dropdown also now displays all available models from both base and custom model settings, sorted alphabetically, making it easier to find and switch between models during conversations.

MAY 17 NEW - Added automatic scrolling and scroll-to-bottom button in chat

Chat messages now automatically scroll to show the latest content as new messages arrive. When you manually scroll up to view older messages, auto-scrolling pauses and a scroll-to-bottom button appears, allowing you to quickly jump back to the most recent messages. The system intelligently detects when you’re near the bottom (within 40 pixels) to resume auto-scrolling.

MAY 17 FIX - Fixed label typo in Assistant panel from 'Mode' to 'Model'

Corrected a typographical error in the Assistant side panel where the label above the model selection dropdown was incorrectly displayed as ‘Mode’ instead of ‘Model’. This fix ensures the UI accurately reflects that users are selecting an AI model, improving clarity and preventing confusion.

MAY 17 NEW - Added model selection with automatic failover configuration

Implemented dynamic model selection in the Assistant panel that automatically configures failover models. When switching from SMART mode to a specific model, the system now updates the app’s failover chain configuration via API. The selected model persists and enables single-model mode, while selecting SMART mode disables the failover chain.

MAY 17 NEW - Added creativity (temperature) and max tokens configuration sliders

Added two new configuration sliders in the chat assistant panel: a Creativity slider (controlling temperature from 0 to 1 in 0.1 increments) and a Max Tokens slider (allowing values up to 32,750 tokens in 50-token steps). Both sliders include visual input fields with improved styling and support for decimal values, giving users fine-grained control over AI response generation parameters.

MAY 17 IMPROVEMENT - Added breadcrumb navigation to chat header with notebook name display

MAY 17 NEW - Added clear conversation button to chat interface

Added a new button with a broom icon in the top-right corner of the chat area that allows users to clear the entire conversation with one click. The button appears muted when there are no messages and becomes active when messages are present. Also improved the model selection display by consistently showing model names with their providers throughout the assistant settings panel.

MAY 16 IMPROVEMENT - Increased document ingestion time limit from 30 to 60 minutes

Extended the maximum processing time for document ingestion tasks from 30 minutes (1,800,000ms) to 60 minutes (3,600,000ms). This allows larger documents and data sources to be fully processed without timing out, improving reliability for ingesting substantial content into the RAG system.

MAY 16 IMPROVEMENT - Increased document ingestion timeout and worker capacity

Increased the maximum time limit for document ingestion operations from the default to 30 minutes (1,800,000 milliseconds), allowing larger documents and datasets to be processed without timing out. Also increased production RAG worker replicas from 3 to 5 to improve ingestion throughput and handle more concurrent document processing tasks.

MAY 16 NEW - Added SMART model selection mode in assistant settings

Added a new ‘SMART’ model option alongside existing provider models in the assistant configuration panel. Users can now select between SMART mode (intelligent automatic model routing) and manual model selection from available providers. The model selector displays provider logos and model names in a dropdown menu with improved visual presentation.

MAY 16 FIX - Fixed mode edit button to only be enabled for smart mode

Fixed an issue where the mode edit button in the Assistant panel was always enabled regardless of the selected mode. The edit button is now properly disabled when manual mode is selected, preventing users from attempting to edit settings that are only applicable to smart mode. Also improved button styling to show a visual disabled state with reduced opacity.

MAY 16 NEW - Added customizable range sliders for Smart Mode parameter tuning

Introduced interactive range sliders in the Smart Mode settings panel, allowing users to fine-tune Quality and Speed parameters with precise control. The sliders support custom min/max values, step increments (0.05 intervals), optional value display, and inverted ranges. Users can now adjust model routing weights through an intuitive visual interface with real-time feedback and step markers.

MAY 16 IMPROVEMENT - Improved assistant configuration with custom instructions and optimistic UI updates

Added the ability to set custom instructions for the AI assistant (e.g., ‘You’re a helpful assistant’) through a new text area in the assistant panel. Implemented optimistic UI updates for app configuration changes, which means changes to assistant settings now appear instantly before server confirmation, providing a more responsive user experience. Also added React Query DevTools for better debugging capabilities.

MAY 16 IMPROVEMENT - Added new icons for smart mode and cost/performance indicators

Added three new icons to the interface: a dollar-circle icon for cost indicators, a speedometer icon for performance/speed metrics, and a stars icon likely for quality or smart features. These icons are now available for use throughout the application, particularly for smart mode editing features and cost/performance visualization.

MAY 16 IMPROVEMENT - Added automatic redirect from /auth to /chat after authentication

Implemented automatic redirection for users who land on the /auth route after completing authentication. Users are now seamlessly redirected to the /chat page instead of staying on the auth endpoint, providing a smoother post-login experience.

MAY 16 NEW - Added Assistant configuration panel with collapsible sidebar

Introduced a new Assistant panel that appears as a 400px sidebar on the right side of the chat interface. Users can now toggle the Assistant panel open and closed using a close button in the panel header. The panel includes a dedicated header with the title ‘Assistant’ and improved layout structure for future assistant configuration options.

MAY 16 NEW - Add Markdown rendering support with syntax highlighting

Added markdown rendering capabilities to the application using react-markdown (v9.0.1) and highlight.js (v11.9.0). Users can now view formatted markdown content with syntax-highlighted code blocks, enabling better display of documentation, comments, and technical content throughout the interface.

MAY 16 NEW - Added copy and reload buttons to chat messages

Added new copy and reload action buttons to chat messages with two new icons (CopyIcon and RepeatIcon). Users can now easily copy message content to clipboard or reload/regenerate responses directly from the message interface. The buttons use a new outlined appearance style with customizable sizes (xs, sm, md, lg).

MAY 15 NEW - Added AI model switcher with provider selection and recommended models

Introduced a new model switcher interface that allows users to change AI models during conversations. The switcher displays recommended models based on the prompt and shows provider logos (via providerLogo function) alongside model names. Users can now easily switch between different AI providers and models mid-conversation, with the currently selected model indicated by a checkmark. The feature includes search functionality and visual indicators for model selection.

MAY 15 FIX - Fixed cursor pointer behavior in model switcher dropdown

MAY 15 IMPROVEMENT - Added model switcher UI component with dropdown interface

MAY 15 NEW - Added conversation functionality with real-time streaming support

Implemented core conversation feature with support for real-time message streaming using Server-Sent Events. Added state management with Zustand and Immer for conversation handling, TanStack Query for data fetching, and Axios for HTTP requests. This enables users to have interactive conversations with streaming responses from the backend API (configured to connect to http://localhost:8080/v1).

MAY 13 FIX - Fixed model benchmark scores to use proper 0-1.0 scale instead of 0-10 scale

Corrected the scoring scale for all model benchmark scores to use decimal values between 0-1.0 instead of 0-10. This affects 29 models including GPT-4, Claude 3, and various Mistral/Llama models. For example, GPT-4’s score was adjusted from 8.8 to 0.88, and Claude-3-opus from 8.1 to 0.81.

MAY 13 MODEL - Added GPT-4o, Qwen 1.5 110B, and Hermes 2 Pro Llama 3 models

Added three new language models: GPT-4o (with function calling support), Qwen 1.5 110B Chat (32K context window) via Together AI, and Hermes-2-Pro-Llama-3-8B (32K context window) via OctoAI. The Hermes model is an open-source Llama 3 fine-tune optimized for conversational and reasoning tasks, while Qwen 1.5 110B is a large-scale decoder-only transformer model.

MAY 13 NEW - Added chat interface with authentication and improved navigation

MAY 13 NEW - Added navigation header with Chat button and icon controls

MAY 10 IMPROVEMENT - Show copy code button on markdown code blocks

Code blocks rendered in markdown content now display a copy button, making it easier to copy code snippets to your clipboard. Previously, the copy button was only available on standalone code blocks, but now appears on all code blocks within markdown content as well.

MAY 02 MODEL - Updated pricing for Groq and OctoAI models, with upcoming deprecations

Updated token pricing for OctoAI models across different sizes: 7B/8B models (

0.15), 13B models (

0.20), Mixtral-8x7B (

0.45), 32B/34B models (

0.75), 70B models (

0.90), and Mixtral-8x22B (

1.20). Several models will be deprecated on May 13, 2024: OctoAI’s CodeLlama (7B, 13B, 34B) and Llama 2 (13B, 70B). Additionally, removed the Groq/Llama-2-70B-chat model.

MAY 01 MODEL - Model deprecation and pricing updates for Groq and OctoAI models

Deprecated Groq’s LLaMA-2-70B-Chat model. Updated token pricing for OctoAI models, with all 7B and 8B models (including Mistral, Code Llama, Llama 2, Llama Guard, and Llama 3) now priced at $0.15 per million tokens for both prompt and completion.

MAY 01 MODEL - Remove Groq's LLaMA-2-70B Chat model

Removed Groq’s LLaMA-2-70B Chat model (groq/llama-2-70b-chat) and its associated settings from the available model options. Users who were using this model will need to switch to an alternative model.

April 2024

APR 29 MODEL - Added OctoAI Llama 3 models (8B and 70B)

Added two new Llama 3 models from OctoAI: llama-3-8b-instruct (8K context) and llama-3-70b-instruct (8K context). Both models support streaming, JSON output, and custom sampling parameters. The 70B model is set as default active and offers higher performance at a higher cost (0.6/1.9µ¢ per token vs 0.1/0.25µ¢ for 8B). These models are instruction-tuned for dialogue and reportedly outperform many existing open-source chat models.

APR 29 IMPROVEMENT - Increased default top-K results for model score hits from 5 to 10

The default number of top similarity search results returned when querying model scores has been increased from 5 to 10, providing more comprehensive scoring information. This change is now configurable via a new —top-k command-line flag, allowing users to customize the number of results based on their needs.

APR 25 IMPROVEMENT - Improved Learning Hub layout with two-column design for videos and steps

The Learning Hub page now uses a two-column layout that displays tutorial videos alongside their corresponding step-by-step instructions. This layout improvement makes it easier to follow along with training courses on topics like Prompt Engineering, Creative Writing, and Model Comparison by allowing users to view both the video and instructions simultaneously without scrolling.

APR 25 FIX - Fixed file drop area background color not displaying in production builds

Fixed an issue where the file drop area’s background color would not render correctly when the application was built for production. The background color now properly displays as a semi-transparent pulze-300 color when not dragging, and switches to a blue background when dragging files over the drop zone.

APR 25 IMPROVEMENT - Updated file drop area to use blue color scheme

Changed the file drop area visual styling from purple to blue colors for better consistency. When dragging files over the drop zone, the border and background now display in blue (blue-v2-500) instead of purple, and the border width has been standardized from 2px to 1px throughout the component for a cleaner appearance.

APR 24 NEW - Added Learning Hub with prompt engineering video tutorials

Introduced a new Learning Hub accessible from the Tools section in the sidebar, featuring educational video content on prompt engineering. The hub includes beginner-level tutorials covering prompt engineering introduction, creative writing for short stories, poetry generation, and additional training materials with embedded Loom video lessons in an expandable accordion interface.

APR 23 MODEL - Added OctoAI Mixtral-8x22B Instruct and updated existing OctoAI model specs

Added OctoAI’s Mixtral-8x22B Instruct model with a 64K context window, supporting chat, streaming, JSON output, and penalties. Updated specifications for existing OctoAI models: Mistral-7B (32K context), Mixtral-8x7B, CodeLlama-7B/13B (16K context) with revised token pricing. The new Mixtral-8x22B model excels in multiple languages including English, French, Italian, German, and Spanish, with strong mathematics and coding capabilities.

APR 23 IMPROVEMENT - Sandbox apps now always inherit custom data from parent apps

Modified sandbox application creation to always inherit custom data from parent apps by setting use_parent_custom_data to true. This ensures consistent data handling between original apps and their sandbox versions, improving testing and development workflows.

APR 23 FIX - Fixed file drag-and-drop upload area and modal close callback handling

APR 22 IMPROVEMENT - Removed legacy NVIDIA Tesla T4 GPU node pools from infrastructure

Removed the legacy GPU node pool configuration that used NVIDIA Tesla T4 accelerators in favor of the newer L4 GPU instances. This cleanup affects both development and production environments, removing T4 nodes that were previously deployed in us-west1-a and us-west1-b with local SSD storage. Users should now utilize the gpu-l4 node pools with g2-standard-4 machines for GPU-accelerated workloads.

APR 22 NEW - Added drag-and-drop file upload with progress tracking in sandbox

The sandbox datasources section now supports drag-and-drop file uploads with real-time upload percentage display. Files can be dragged directly into the sandbox area, with invalid file types automatically rejected and filtered out. A visual drop zone indicator appears during drag operations to guide users.

APR 22 NEW - Add drag-and-drop file upload with visual drop zone in custom data

You can now upload files to custom data by dragging and dropping them directly onto the files list. A visual drop zone with a purple dashed border appears when dragging files over the area, making it clear where to drop your files. This complements the existing file upload functionality with a more intuitive interface.

APR 21 MODEL - Add Llama-3-70B-Instruct via Together AI and Groq providers

Added Llama-3-70B-Instruct model through Together AI and Groq providers to the default active model list. This expands the availability of one of Meta’s largest and most capable instruction-tuned models across multiple providers.

APR 21 MODEL - Added Groq's Llama 3 8B and 70B models

Added support for two new Llama 3 models from Groq: llama-3-8b-instruct and llama-3-70b-instruct. Both models feature an 8,192 token context window, support for streaming and chat completions. The 8B model costs

0.05/million prompt tokens and

0.1/million completion tokens, while the 70B model costs

0.59/million prompt tokens and

0.79/million completion tokens.

APR 19 IMPROVEMENT - Renamed 'Apps' to 'Projects' throughout the user interface

Updated terminology in the application selector component from ‘Apps’ to ‘Projects’ for better clarity. The label in the log filters and component tests now displays ‘Projects’ instead of ‘Apps’, making the interface more intuitive and aligned with standard project management terminology.

APR 19 IMPROVEMENT - Improved custom data upload with individual file progress tracking

Enhanced the custom data upload interface with individual progress tracking for each file being uploaded. Files and URLs are now uploaded in parallel instead of as a single batch, with separate progress indicators and success/error handling for each item. Added file preview functionality and improved error messages to show which specific uploads succeeded or failed.

APR 18 NEW - Enhanced custom data management for apps with improved sandbox capabilities

Introduced new features for managing custom data in apps, including the ability for sandbox apps to use parent app’s custom data through the ‘use_parent_custom_data’ setting. Added new endpoints for listing, updating, and batch deleting custom data files, with improved file metadata management and search capabilities. Custom data files now have an ‘active’ status for better state management.

APR 18 MODEL - Added Llama 3 models (8B and 70B) via Together AI

Added support for two new Llama 3 instruction-tuned models through Together AI: llama-3-8b-instruct (8K context) and llama-3-70b-instruct (8K context). Both models support streaming, multiple outputs (n), and custom penalties. Pricing is set at

0.0002 per 1K tokens for 8B and

0.0009 per 1K tokens for 70B.

APR 17 MODEL - Removed Mixtral-8x22B Instruct (Together) from default active models

The Mixtral-8x22B Instruct model from Together AI provider has been removed from the default active models list. This change affects model availability in the default configuration, though the model may still be accessible if explicitly enabled.

APR 17 MODEL - Added Mixtral-8x22B and WizardLM-2 8x22B models via Together.ai

Added two new large language models: Mixtral-8x22B Instruct (65K context) and WizardLM-2 8x22B (65K context) through Together.ai. Mixtral-8x22B Instruct is enabled by default while WizardLM-2 is available as an optional model. Both models support streaming and penalties, with token pricing at $0.0012 per 1K tokens. Additionally, increased the OctoAI Mixtral-8x22B fine-tuned model’s context window to 65K tokens.

APR 17 MODEL - Add Gemini 1.5 Pro Preview and update Gemini model lineup

Added Gemini 1.5 Pro Preview (1M token context window) and Gemini 1.0 Pro-002 models. Updated pricing for all Gemini models with text-only capabilities costing

0.000000125/token for input and

0.000000375/token for output. Gemini 1.5 Pro Preview features higher pricing (

0.0000025/token input,

0.0000075/token output) and supports multimodal inputs including image, audio, video, and PDF files.

APR 17 IMPROVEMENT - Improved RAG engine retrieval speed with cross-encoder reranking

Significantly enhanced the speed and accuracy of document retrieval in the RAG engine by adding a cross-encoder reranking service (BAAI/bge-reranker-base model) running on GPU infrastructure, and implementing Redis caching (1GB Standard HA instance with LRU eviction policy) to store frequently accessed results. This optimization reduces latency when retrieving relevant documents for AI-powered responses.

APR 17 FIX - Fixed compare mode to send prompts to only 3 models at a time

Fixed an issue in compare mode where prompts were being sent to all available models simultaneously. Now limits concurrent model requests to the top 3 ranked models to prevent performance issues and improve response handling. This change applies to both standard and alternative response loading paths in the sandbox.

APR 16 MODEL - Updated default active model list with latest Claude 3, Mistral, and other models

Updated the default active model list to include Claude 3 series (Haiku, Opus, Sonnet), Mistral’s new models (Small, Medium, Large), and latest versions of GPT-4 Turbo (2024-04-09) and GPT-3.5 Turbo (0125). Also includes Cohere Command-R/R+, Mixtral-8x7b-instruct variants from Groq, OctoAI, and Together.ai, plus Together’s DBRX-instruct model.

APR 12 MODEL - Added new OctoAI models and updated existing model specifications

Added three new OctoAI models: Hermes 2 Pro Mistral 7B (32K context), Qwen 1.5 32B Chat (32K context), and Mixtral 8x22B Finetuned (32K context). Updated specifications for existing OctoAI models: Mistral-7B-Instruct and Mixtral-8x7B-Instruct received token cost updates, while CodeLlama models’ context windows were set to 16K. All new models support streaming, JSON mode, and custom sampling parameters.

APR 11 MODEL - Add GPT-4 Turbo and GPT-4 Turbo Vision models with 128K context window

Added support for OpenAI’s latest GPT-4 Turbo models: ‘gpt-4-turbo’ and ‘gpt-4-turbo-2024-04-09’. Both models feature a 128,000 token context window and support for vision capabilities, function calling, streaming, JSON mode, and custom sampling parameters. The models are priced at

0.01/1K tokens for prompts and

0.03/1K tokens for completions.

APR 11 IMPROVEMENT - Return all active model scores in playground model ranking

Enhanced playground model ranking to return scores for all available whitelisted models instead of being limited to just 3 models. The scoring system now uses percentile normalization for cost, quality, and latency metrics, providing more comprehensive model comparison data. This allows users to see scoring and ranking information for all applicable models when making model selection decisions.

APR 11 FIX - Fixed RAG worker crashes due to out-of-memory errors

Resolved an issue where the RAG worker would crash from out-of-memory (OOMKill) errors during vector store operations. Reduced the batch size from 32 to 16 when processing documents to lower memory consumption and prevent worker crashes.

APR 11 FIX - Fixed sandbox initialization and model score display issues

Fixed sandbox chat not initializing with the correct failover and weight settings from the app configuration. Also corrected the model score display logic so that quality ratings now show ‘N/A’ with grayed-out styling when models haven’t been scored yet, and score badges only appear when the score is non-zero, preventing misleading score information from being displayed.

APR 11 FIX - Fixed sandbox not refreshing when switching between different sandboxes

Fixed an issue where the sandbox environment would not properly refresh when switching between different sandbox applications. The sandbox provider now correctly remounts when changing to a different sandbox app ID, ensuring a clean state for each sandbox instance.

APR 10 FIX - Fixed broom icon display and clear button behavior in sandbox chat

Fixed the broom icon in the sandbox chat header by replacing it with a properly rendered SVG component. The clear conversation button now correctly disables when there are no messages instead of checking for root conversation state. Also removed debug output that was accidentally displaying auto-scroll state in the chat header.

APR 10 FIX - Fixed color of creativity (temperature) slider in sandbox

Corrected the visual appearance of the creativity/temperature slider in the sandbox interface by updating its color from the incorrect red-500 to the proper red-v2-500 theme color. This ensures consistent styling with the application’s design system.

APR 10 FIX - Fixed sandbox loading states, error handling, and retry functionality

Fixed multiple issues in the sandbox interface: added proper loading skeleton states while messages are being processed, improved error handling to display user-friendly error messages with a retry button instead of raw error text, and implemented retry functionality for failed responses. Users can now click a retry link when errors occur to resubmit their request without starting over.

APR 09 MODEL - Added 3 new Together.ai models: DBRX-Instruct, DeepSeek Coder 67B, and Qwen 1.5-32B

Added support for three new Together.ai models: DBRX-Instruct (32K context), DeepSeek Coder 67B (4K context), and Qwen 1.5-32B (32K context). DBRX-Instruct is a mixture-of-experts model optimized for few-turn interactions, while DeepSeek Coder 67B specializes in code-related tasks. All models support streaming and penalty parameters, with competitive pricing ranging from 0.8e-6 to 1.2e-6 per token.

APR 09 MODEL - Updated router version to include dbrx-instruct model support

Updated the model router to version pulze-v0.1-20240409-alpha1, which adds support for the dbrx-instruct model. This new version replaces pulze-v0.1-20240405-alpha1 and expands the available model options for API users across both development and production environments.

APR 08 IMPROVEMENT - Updated production router model to version pulze-v0.1-20240405-alpha1

Upgraded the model router to a newer version (pulze-v0.1-20240405-alpha1, released April 5, 2024) from the previous version (pulze-v0.1-20240330-alpha1, released March 30, 2024). This update includes improvements to the model scoring and routing logic for better performance and accuracy in selecting the optimal model for each request.

APR 05 IMPROVEMENT - Improved RAG response quality by updating system prompt guidelines

Enhanced the RAG system prompt to prevent explicit context citations and redundant source mentions. The prompt now instructs the model to naturally incorporate context information without directly referencing it, while still maintaining the requirement to stay within provided context boundaries.

APR 05 IMPROVEMENT - Improved model routing using last conversation message

Enhanced model routing accuracy by using the last message in conversation history for route selection instead of the full conversation context. This change optimizes model selection by focusing on the most recent user query, particularly important for multi-turn conversations where the latest message is most relevant for determining the appropriate model.

APR 05 IMPROVEMENT - Improved URL data source handling with direct redirects

Enhanced the custom data file download endpoint to properly handle URL-type data sources by redirecting users directly to the original URL. Previously, all data sources were treated as file references, but now the system intelligently routes URL data sources to their original locations while maintaining signed URL generation for file-based sources.

APR 05 NEW - Added support for URLs as data sources for custom knowledge bases

Users can now provide URLs alongside file uploads when creating custom knowledge bases for their apps. The system accepts both file uploads and URLs simultaneously, with URLs being automatically scraped and processed as HTML content. Each URL is stored separately and can be refreshed by re-uploading, providing more flexibility in maintaining up-to-date knowledge bases.

APR 05 MODEL - Added GPT-3.5 Turbo 0125 and updated context window for GPT-3.5 Turbo

Added support for GPT-3.5 Turbo 0125 model with 16,385 token context window, featuring improved format accuracy and fixes for non-English function calls. The model supports streaming, JSON mode, and function calling. Additionally updated the base GPT-3.5 Turbo model’s context window to 16,385 tokens. Pricing is set at

0.0005/1K prompt tokens and

0.0015/1K completion tokens.

APR 04 MODEL - Added Cohere Command-R+ model with 128K context window

Added support for Cohere’s Command-R+ model, which features a 128,000 token context window and is optimized for complex RAG workflows and multi-step tool use. The model supports streaming, penalties, and multiple completions (n), with token costs of

0.003/1K for prompt tokens and

0.015/1K for completion tokens.

APR 04 MODEL - Together AI: Add Nous Hermes 2 Mistral, deprecate WizardCoder and 50+ models

Added Nous Hermes 2 Mistral 7B DPO model (32K context window) with improved benchmark performance across AGIEval, BigBench Reasoning, GPT4All, and TruthfulQA. Deprecated over 50 Together AI models including WizardCoder 15B, Falcon series, Llama 2 series, Qwen series, and various other models. Also updated Qwen 1.5 72B Chat’s context window to 32K tokens.

APR 03 NEW - Added secure file download support for custom data uploads

Added support for securely downloading custom data files uploaded to apps through signed URLs. Files are now accessible through a new endpoint ‘/custom-data//files/’ which generates temporary signed URLs valid for 10 minutes, ensuring secure access to uploaded content while preventing unauthorized downloads.

March 2024

MAR 29 MODEL - Added Claude-3 Haiku and Cohere Command-R models

Added support for Claude-3 Haiku (200K context window) and Cohere Command-R (128K context window). Claude-3 Haiku is Anthropic’s fastest model optimized for near-instant responses, while Command-R is Cohere’s new instruction-following model with improved capabilities for code generation, RAG, and tool use. Additionally, Cohere’s Command-Light model now supports chat completion API.

MAR 28 FIX - Fix default chunk size for document processing in RAG store

Fixed the document splitter to use explicit default values instead of relying on unspecified defaults. Documents are now split into chunks of 500 tokens with a 200-token overlap between consecutive chunks, ensuring consistent and predictable behavior when processing documents for retrieval-augmented generation.

MAR 26 TECHNICAL - Removed categorizer and knowledge graph functionality

Removed the text categorization system that automatically classified prompts into 20 predefined categories (like Arts & Science, History). This change simplifies the model selection architecture by removing the category-based routing and knowledge graph dependencies. API functionality remains the same, but model selection no longer uses categorical classification.

MAR 26 FIX - Fixed model selection scoring for latency and cost optimization

Improved the accuracy of model selection by fixing how latency and cost scores are calculated during model ranking. Previously, the scoring system was using inverted values incorrectly, which could lead to suboptimal model selections. The fix ensures that models with lower latency and cost are properly prioritized during the selection process.

MAR 26 FIX - Fixed RAG context calculation in playground model ranking

Fixed an issue where RAG context size was being incorrectly calculated due to only using the last message instead of the full conversation history. Now properly processes the entire message history and converts it to the correct prompt format before model ranking, ensuring more accurate context size calculations and better model recommendations.

MAR 25 FIX - Fix case-sensitive email matching in signup whitelist

MAR 25 FIX - Fixed signup whitelist validation for production environment

MAR 25 FIX - Fixed signup whitelist validation in production environment

Fixed an issue with the signup whitelist validation where users couldn’t register even when whitelisted. The fix corrects the function call to properly check if an email is on the whitelist before allowing registration in the production environment. Non-whitelisted users will still be directed to join the waitlist.

MAR 25 FIX - Fixed signup whitelist validation for production environment

Fixed an issue with the signup whitelist validation by correcting the reference to the signup whitelist checker (crud_signup_whitelist). This ensures proper validation of new user registrations against the whitelist in the production environment, maintaining controlled access during the waitlist period.

MAR 24 NEW - Added email whitelist system for controlling signup access

Introduced a new signup whitelist system that allows specific email addresses to register accounts in production, even when general signups are closed. This feature adds granular control over user registration through a new signup_whitelist database table that stores approved email addresses. Non-whitelisted users will be directed to join the waitlist.

MAR 23 FIX - Fixed model response handling for edge cases with empty or error responses

Improved error handling when models return unexpected responses, specifically for cases where Mistral returns a 200 status code but no completion tokens, or when responses contain an error finish reason. This ensures more accurate health monitoring and error reporting for model API calls.

MAR 22 TECHNICAL - Improved model monitoring and latency tracking system

Enhanced the model monitoring system to track more detailed latency metrics, including p50, p90, and p99 percentiles for per-token latency. Removed the legacy Redis-based EWMA latency tracking in favor of more accurate percentile-based measurements. This change provides more reliable performance monitoring and optimization capabilities.

MAR 22 IMPROVEMENT - Enhanced playground model ranking with RAG context integration

Improved model ranking in the playground by incorporating RAG (Retrieval-Augmented Generation) context when custom data is present. The system now hydrates prompts with relevant context from your custom data before ranking models, resulting in more accurate model recommendations based on the full context of your queries.

MAR 22 FIX - Fixed streaming responses to properly indicate completion with [DONE] marker

Fixed streaming responses across all providers (OpenAI, Anthropic, Google, Cohere) to properly end with a ‘data: [DONE]’ message. This change ensures clients can reliably detect when a streaming response has completed. Additionally updated the playground endpoint to use the correct ‘text/event-stream’ content type for Server-Sent Events.

MAR 22 FIX - Fixed rerun button being enabled with no chat history in sandbox

Fixed an issue where the rerun button in the sandbox chat could be enabled even when there were no chat items in the conversation history. The button is now properly disabled when the chat is empty (items.length === 0), preventing users from attempting to rerun prompts when there’s no conversation context available.

MAR 21 IMPROVEMENT - Added visual icons for ratings in explanation box

Added five new rating icons to enhance the visual appearance of the explanation box: rocket, stars, speedometer-fast, face-neutral, and thumbs-down. These icons complement the existing thumbs-up icon to provide more expressive rating options in the user interface.

MAR 21 IMPROVEMENT - Dynamic color-coded model scores based on performance ranges

Model scores now display with color-coded backgrounds that reflect performance levels: gray for N/A (0), gradual color progression for scores 0.5-0.97, and a vibrant gradient (orange/pink/purple) for top scores above 0.97. The ‘highlighted’ prop was removed in favor of this standardized, score-based visual system. Score displays are now consistent across all views including the sandbox response comparison and explanation popovers.

MAR 21 FIX - Fixed explanation popover positioning and interaction issues

MAR 20 FIX - Improved model fallback behavior and API key handling

Fixed how the system handles API key issues and model retries. When a provider’s API keys are unavailable or invalid, the system now smoothly skips to the next available model instead of failing. Also improved the retry logic to better handle temporary failures and continue with alternative models when appropriate.

MAR 20 NEW - Add model selection explanation feature with detailed scoring breakdown

Added a new feature that provides transparency into model selection decisions by exposing detailed scoring metrics (Quality, Latency, Cost) for each model candidate. The system now tracks and returns whether each model was actively scored, and includes context about similar historical prompts to explain model recommendations. This helps users understand why specific models are suggested for their use case.

MAR 20 IMPROVEMENT - Enhanced model selection and failover chain behavior

Improved model filtering and selection logic to better handle both explicit model requests and failover chains. When users specify a model, the system now properly includes all failover models in the candidate list while maintaining the original selection preferences. The scoring and ranking system has been streamlined to provide more accurate model recommendations while respecting project whitelists.

MAR 19 FIX - Fixed RAG compatibility issues with sandbox apps and empty responses

Fixed two issues with RAG (Retrieval Augmented Generation) functionality: 1) RAG now properly works with sandbox applications by checking custom data against the parent app, and 2) RAG engine now gracefully handles cases where no relevant document chunks are found, returning the original query prompt instead of failing. These changes improve reliability when using custom data with sandbox environments.

MAR 19 IMPROVEMENT - Improved model scoring for RAG-enhanced prompts

Enhanced model scoring system to properly evaluate RAG (Retrieval-Augmented Generation) hydrated prompts. The system now correctly processes and scores both standard prompts and chat messages when using RAG, ensuring more accurate evaluation of model performance with retrieved context. This includes preserving the hydrated prompt in both regular prompt and chat message formats for consistent scoring.

MAR 19 MODEL - Updated model scoring system to pulze-v0.1-20240313-alpha1

Upgraded the internal model scoring system from pulze-v0.1-20240312 to pulze-v0.1-20240313-alpha1. This alpha version includes improvements to model selection and routing algorithms that may affect which models are recommended for your requests.

MAR 19 IMPROVEMENT - Removed yellow background highlighting from test log entries

Test log entries in the logs view no longer display with a yellow warning background color. Previously, rows marked as test entries were highlighted with a yellow background (bg-alert-warning-light), but this visual distinction has been temporarily removed for a cleaner, more uniform appearance in the logs table.

MAR 18 IMPROVEMENT - Enhanced RAG engine with improved prompt handling and smarter document retrieval

Improved RAG (Retrieval Augmented Generation) functionality with better handling of both chat and non-chat completion requests. The system now intelligently retrieves relevant documents based on the most recent query in chat history or the full prompt for non-chat requests. Additionally, updated the Q&A system prompt template to generate more natural responses that seamlessly incorporate retrieved information without explicitly referencing sources.

MAR 18 IMPROVEMENT - Enhanced RAG functionality now triggers across all organization apps

Improved RAG (Retrieval Augmented Generation) engine to trigger whenever an organization has custom data in any app, not just the current app. Previously, RAG was only triggered when the specific app being queried had custom data. This change makes custom data accessible across all apps within an organization, providing more consistent and comprehensive responses.

MAR 18 NEW - Enhanced playground sandbox features and app model configuration

Improved the playground sandbox functionality with new app model configuration capabilities. Test models are now non-streamable by default, and the app model configuration system has been updated to better handle sandbox environments. The update includes simplified app creation with default model weights and policies, and improved handling of sandbox modes for testing and development.

MAR 18 FIX - Fixed Anthropic system messages handling to improve chat conversation flow

Fixed how system messages are handled in Anthropic chat conversations by merging system instructions into the first user message rather than sending them separately. This improves compatibility with Anthropic’s chat models and ensures system instructions are properly incorporated into the conversation context.

MAR 18 FIX - Fixed handling of emojis and special characters in Google AI streaming responses

Fixed an issue where emoji characters and other special Unicode characters would cause errors or display incorrectly in Google AI model responses. The fix improves UTF-8 character handling in the streaming response parser for both Google AI and Google Chat models, ensuring proper decoding of all Unicode characters including emojis.

MAR 18 IMPROVEMENT - Updated LLM model selection weights to prioritize quality by default

Modified default model selection weights to prioritize quality (1.0) over cost (0.0) and latency (0.0). This change makes quality the sole default consideration when automatically selecting language models, which should result in higher quality responses out of the box.

MAR 18 FIX - Update AI21 Labs model pricing for J2-Ultra, J2-Mid, and J2-Light models

Updated token pricing for AI21 Labs J2 series models. J2-Ultra now costs

0.002 per prompt token and

0.01 per completion token, J2-Mid costs

0.00025 per prompt token and

0.00125 per completion token, and J2-Light costs

0.0001 per prompt token and

0.0005 per completion token.

MAR 18 FIX - Fixed handling of invalid finish_reasons from language models

Fixed an issue where the API would crash when language models returned unexpected finish reasons. Instead of raising an exception, the API now logs these cases and continues processing, improving reliability when working with models that may return non-standard completion statuses.

MAR 18 FIX - Fixed sidebar collapsing unexpectedly on first page load

Fixed an issue where the sandbox sidebar would automatically collapse on the first visit to the page. The sidebar now remains expanded by default when no previous user preference is stored, instead of defaulting to a collapsed state. Additionally, improved spacing in the compared response view by adding left padding.

MAR 14 FIX - Fixed incomplete response handling in Google models

Fixed an issue where incomplete JSON chunks from Google model responses could cause the application to crash. The system now gracefully handles partial response chunks from both Google’s text completion and chat completion APIs, ensuring more reliable and stable interactions with Google’s language models.

MAR 13 NEW - Added GCS support for custom data uploads in RAG component

Enhanced the custom data upload functionality to store files in Google Cloud Storage (GCS) instead of directly in the database. Files are now organized in a structured format (uploads/org_id/app_id/filename) within a dedicated RAG component bucket, with duplicate file uploads prevented based on size comparison. The system now marks uploaded files with a ‘pending’ status for subsequent indexing.

MAR 13 MODEL - Updated pricing for Cohere Command model

Updated token pricing for the Cohere Command model. Input (prompt) tokens now cost

0.0005 per 1K tokens (down from previous rate), and output (completion) tokens cost

0.0015 per 1K tokens.

MAR 13 FIX - Fixed textarea auto-grow behavior and positioning

Fixed textarea auto-grow functionality to properly adjust height when content changes. The auto-grow now triggers whenever the textarea value changes using React’s useEffect hook instead of on specific keyboard events. Also fixed the positioning of clear and end icons in textareas to appear in the top-right corner instead of vertically centered, and added max-height constraints (50vh) to prevent textareas from growing too large.

MAR 13 IMPROVEMENT - Updated icon assets in Playground interface

Improved visual consistency of copy and delete icons in the Playground interface. Updated icon dimensions from 24x24 to 20x20 pixels with refined SVG paths for better rendering. Added new dark-themed variants (copy-dark.svg and delete-dark.svg) to support different UI themes, ensuring better visibility across light and dark modes.

MAR 12 IMPROVEMENT - Updated model scoring system with performance adjustments

Updated model performance scores across the LLM lineup, including significant adjustments for newer models like Claude 3 Opus (8.1), Claude 3 Sonnet (8.2), and GPT-4-0125-preview (9.0). Added new models Gemma-7b-it and adjusted scoring methodology to prepare for upcoming ‘Why this model?’ feature integration.

MAR 12 NEW - New Sandbox UI/UX with Test Request Tracking

Added new sandbox environment capabilities with dedicated test request tracking. Database changes now include an ‘is_test’ flag to mark requests performed from sandbox apps, enabling better separation between testing and production usage. This update improves the app settings structure with clearer model organization, including separate tracking of active base models, custom models, and failover chains.

MAR 12 IMPROVEMENT - Remove optional policy to ignore unsupported model features

Removed the ignore_unsupported_features policy option that previously allowed requests to proceed even when using unsupported model features. The API will now always validate that requested features are supported by the chosen model, providing clearer error messages when incompatible features are requested.

MAR 11 FIX - Closed new user signups temporarily with waitlist option

Temporarily disabled new user registrations in the production environment. New users attempting to sign up will now receive a message directing them to join the waitlist. Existing users are unaffected and can continue to use the platform normally.

MAR 10 MODEL - Added Gemma 7B Instruct model to Groq and updated Mistral-7B context window

Added support for Gemma 7B Instruct model on Groq, featuring an 8,192 token context window and streaming capability. The model is open-source and based on the same technology used to create Gemini models. Additionally, updated the context window for Together AI’s Mistral-7B-Instruct-v0.1 to 8,192 tokens.

MAR 10 NEW - Added streaming support for MistralAI models

Enabled streaming capabilities for all MistralAI models by switching to OpenAI’s implementation. This update removes support for ‘n’ concurrent completions but adds streaming functionality, allowing for real-time response generation with MistralAI models. Database schema has been updated to reflect these capability changes.

MAR 10 MODEL - Updated Cohere Command and Command-Light model features and pricing

Updated token pricing and features for Cohere Command and Command-Light models. Command model now costs

0.001/1K prompt tokens and

0.002/1K completion tokens, while Command-Light costs

0.0003/1K prompt tokens and

0.0006/1K completion tokens. Both models now support the ‘n’ parameter for generating multiple completions.

MAR 08 NEW - Added streaming support for Cohere language models

Added support for streaming responses when using Cohere language models, allowing for real-time text generation. The implementation handles streamed chunks of text with proper error handling and includes support for temperature, max tokens, frequency penalty, and presence penalty parameters. Note that logit bias, top_p, and best_of parameters are not supported by Cohere’s API.

MAR 07 MODEL - Added self-hosted Gemma-7B model support via Pulze provider

Added support for Google’s Gemma-7B model through Pulze’s self-hosted infrastructure. The model features a 4,096 token context window and is available at no cost (0 tokens/request). This open-source model is configured for completion-only tasks (no chat, streaming, or function calling support) and is integrated through Pulze’s infrastructure.

MAR 07 NEW - Added streaming support for Google AI models

Added support for streaming responses from Google AI models using server-side events (SSE). This implementation allows real-time, token-by-token streaming responses with all supported parameters including temperature, top_p, max tokens, and presence/frequency penalties. The feature maintains compatibility with OpenAI’s streaming format while adding proper safety attribute handling and usage calculation for Google’s models.

MAR 07 FIX - Fixed model provider display and counter in app settings

Fixed an issue where unchecked model providers were not automatically collapsing in the app settings interface. Also corrected the provider counter to accurately display the number of active base models and custom models separately, rather than showing a combined total.

MAR 07 MODEL - Updated Google's chat-bison-32k model to version 002 with adjusted capabilities

Updated Google’s chat-bison-32k model from version 001 to 002. The model no longer supports generating multiple completions (n>1) in a single request. This change affects both new and existing model versions to maintain consistency with Google’s API capabilities.

MAR 07 FIX - Fixed app creation to correctly set custom names

Fixed an issue where custom app names weren’t being properly set during app creation. Previously, the app description field was incorrectly used instead of the name field. Now, users can properly set custom names for their apps during creation, with random names still being generated as a fallback when no name is provided.

MAR 07 FIX - Removed image size restriction and improved Edit Profile page display

Fixed the Edit Profile page to display user profile images of any size instead of blocking images larger than 1 megapixel. Also improved the authentication provider information banner to support additional provider types (Google2 and test tokens) and enhanced the copy-to-clipboard functionality with better tooltips that show ‘Copied to clipboard’ after successful copying.

MAR 07 FIX - Fixed playground not clearing messages when switching contexts

Fixed a bug where playground messages were not properly cleared when changing apps or contexts. The issue was caused by missing the chatItems dependency in the component’s dependency array, which prevented the playground from resetting when the conversation state changed.

MAR 06 IMPROVEMENT - Improved model scoring for unscored LLMs

Modified model scoring algorithm to handle unscored models more effectively. Instead of zeroing out scores for models without quality metrics, the system now incorporates their latency and cost metrics into the final ranking. This change ensures more balanced model selection based on all available performance metrics, even when quality scores are unavailable.

MAR 06 NEW - Added sandbox/ephemeral app environments for testing and configuration

Users can now create sandbox versions of their apps for testing and configuration purposes. Each user can have their own sandbox instance of an app, allowing safe experimentation with settings and configurations without affecting the production app. Added new fields including app name, description, and sandbox relationships in the database schema.

MAR 06 IMPROVEMENT - Removed NVIDIA A100 GPU node pool configurations

Removed support for NVIDIA Tesla A100 GPU node pools in both production and development environments. The following configurations are no longer available: single A100 (a2-highgpu-1g), dual A100 (a2-highgpu-2g), quad A100 (a2-highgpu-4g), and octa A100 (a2-highgpu-8g) machine types in the us-west1-b location. Users should utilize alternative GPU options such as NVIDIA L4 for their workloads.

MAR 06 IMPROVEMENT - Renamed 'Sandbox' tab to 'Playground' in app management details

The ‘Sandbox’ tab in the app management details section has been renamed to ‘Playground’ for better clarity. This changes the navigation label and URL routing, but the functionality remains the same. Users will now see ‘Playground’ instead of ‘Sandbox’ when viewing application details.

MAR 06 IMPROVEMENT - Reorganized app management tabs and temporarily disabled sandbox mode toggle

Moved the Sandbox tab to the first position in the app management navigation and removed the Playground tab. Temporarily hidden the Sandbox Mode toggle switch while preserving the underlying functionality for future use. Added copy-to-clipboard functionality for model namespaces to make it easier to reference model names. Also updated test configurations to use qwen1.5-0.5b-chat model instead of mistral-tiny and increased test timeout from 10 to 20 seconds.

MAR 06 IMPROVEMENT - Reorganized App management tabs and added Smart Router icon support

Reordered the App management detail tabs to prioritize Flowz, Playground, and Sandbox before Logs. Added a new Smart Router icon to the icon library and enabled icons to be displayed in card selector components. Updated the Installation panel to include a link to the Sandbox tab for easier navigation between related features.

MAR 05 IMPROVEMENT - Improved model scoring with quantile normalization for more balanced results

Enhanced the model scoring algorithm by switching from min-max to quantile normalization for cost and quality metrics. This change provides more balanced and robust scoring across models by reducing the impact of outliers, resulting in more reliable model recommendations based on cost, quality, and latency preferences.

MAR 05 FIX - Fixed Anthropic API compatibility by removing extra message parameters

Fixed an issue where requests to Anthropic’s API included additional message properties that could cause compatibility issues. Messages are now properly formatted to only include the required ‘role’ and ‘content’ fields when making requests to Anthropic’s models, ensuring better API compatibility.

MAR 05 IMPROVEMENT - Updated model optimization presets for more predictable routing behavior

Modified the model optimization presets to produce more deterministic results. The cost preset now prioritizes cost at 1.0 with minimal quality consideration (0.1), the latency preset focuses solely on speed at 1.0 with minimal quality (0.1), and the quality preset maximizes quality at 1.0 with slight latency consideration (0.1). These adjusted weights reduce the influence of competing factors, making model selection more predictable and aligned with the chosen optimization goal.

MAR 05 MODEL - Updated Pulze model version to pulze-v0.1-20240305

Updated the Pulze model scoring system to version pulze-v0.1-20240305 (released March 5, 2024). This version update applies to both development and production API environments and may include improvements to model performance evaluation and routing decisions.

MAR 05 IMPROVEMENT - Pricing page now redirects to home page

The /pricing route has been updated to redirect users directly to the home page (/). Users visiting the pricing page will now automatically be taken to the main landing page instead.

MAR 04 MODEL - Added GPT-4 Turbo Preview and Claude-3 Models (Opus & Sonnet)

Added support for OpenAI’s GPT-4 Turbo Preview (128K context window) and Anthropic’s Claude-3 models: Opus and Sonnet (both with 200K context windows). Claude-3 Opus offers highest performance for complex tasks with 15/75μ¢ token pricing, while Sonnet provides balanced performance at 3/15μ¢ per token. Additionally, Claude-2 models were updated to use the messages API format, with claude-2 now pointing to claude-2.1 as the default target.

MAR 04 FIX - Fix scoring accuracy in Playground responses

Fixed an issue where response scores in the Playground were not being calculated correctly due to incorrect request type handling. The fix updates the abstract provider engine to use the correct request table schema, ensuring accurate response scoring for all playground interactions.

MAR 04 MODEL - Deprecated all Replicate provider models

All models from the Replicate provider have been deprecated as of March 4, 2024. These models will no longer be available for use through the API. This change affects all previously available Replicate-hosted models.

MAR 04 IMPROVEMENT - Added streaming support status checks for specific providers

Improved streaming support verification to provide clearer error messages for unsupported providers. Streaming is now explicitly supported for OpenAI, Groq, Together, and OctoAI providers, with helpful error messages for other providers indicating future support is planned. The system now checks both model-level streaming capability and provider-level support.

MAR 04 IMPROVEMENT - Improved streaming response handling with correct model namespace display

Fixed streaming responses to consistently show the full model namespace (e.g., ‘test/openai’ instead of just ‘test’) in streaming chunks. This improvement ensures users see the complete and accurate model identifier throughout the entire streaming response, making it easier to track which model is generating the output.

MAR 04 IMPROVEMENT - Model labels now display version information when available

Model labels in the UI now automatically include version information (e.g., “model@version”) when the model has an associated version tag. This makes it easier to distinguish between different versions of the same model at a glance. The version information is appended with an @ symbol after the model name.

MAR 02 IMPROVEMENT - Improved model selection by using provider-specific latency benchmarks

Enhanced model scoring system to use provider-specific latency benchmarks instead of a default high value. This improvement makes model selection more accurate by using real-world latency data from each provider, or falling back to provider-specific maximums when exact data isn’t available. Also added quantile normalization for latency scores to better handle variations between different providers.

MAR 01 NEW - Add streaming support for OpenAI, TogetherAI and Groq models

Added support for streaming responses from language models across OpenAI, TogetherAI, and Groq providers. The feature works with both chat completions and text completions endpoints, allowing real-time token-by-token responses. Streaming responses are automatically handled with proper background task cleanup and token consumption tracking.

MAR 01 FIX - Improved model scoring and selection logic

Enhanced the model scoring mechanism to better handle different Pulze synthetic models (PULZE, PULZE_V0, PULZE_V01) with specialized scoring strategies. The update includes a more robust scoring system that considers quality, latency, and cost weights when selecting models, and properly handles both synthetic and fully qualified models. This results in more accurate model selection based on user-specified preferences.

MAR 01 NEW - Added token price optimization tool for cost calculations

Introduced a new tool that helps convert and optimize token costs between different units (per 1K or 1M tokens) into precise decimal values without approximation errors. The tool accepts multiple price inputs (including scientific notation like 1.5e-6) and automatically calculates the optimal decimal places needed for accurate per-token pricing. This enables more precise cost tracking and billing calculations for token usage.

MAR 01 FIX - Fixed optimistic update using incorrect value in app mutation hook

Fixed a bug in the app update hook where optimistic UI updates were using the old value instead of the new value. This ensures that when updating app settings, the interface immediately reflects the correct updated state rather than temporarily showing stale data before the server response arrives.

February 2024

FEB 29 MODEL - Added Replicate & Mistral models, deprecated older Replicate models

Added 9 Replicate models including Llama 2 (7B/13B/70B base and chat variants), Mistral-7B, Mistral-7B-Instruct v0.2 (32K context), and Mixtral-8x7B-Instruct (32K context). Also added Mistral Large model and deprecated older Replicate models (mistral-7b-instruct-v0.1, mistral-7b-openorca, codellama-13b). New models support streaming and most feature extended context windows up to 32K tokens.

FEB 29 IMPROVEMENT - Improved model scoring system with enhanced model matching

Enhanced the model scoring system to support both namespace and direct model name matching when determining model quality scores. Updated to use a new modelscores server endpoint (port 8888) with a simplified scoring response format that provides direct model-to-score mappings rather than the previous complex temperature-based scoring system.

FEB 28 MODEL - Add Groq provider with Llama-2-70B and Mixtral-8x7B models

Added support for Groq as a new model provider with two powerful models: Llama-2-70B-Chat (4,096 token context) and Mixtral-8x7B-Instruct (32,768 token context). Both models support streaming and multiple completions (n>1), with Mixtral offering significantly larger context window and lower token costs. These models are integrated through Groq’s OpenAI-compatible API.

FEB 27 FIX - Fixed error handling for blocked Gemini Pro responses

Fixed an issue where the API would crash when Gemini Pro blocked a response due to safety filters. The system now properly handles cases where the model returns no text content and missing token counts, returning an empty response instead of throwing an error.

FEB 27 MODEL - Model updates: Deprecations, name standardization, and GPT-4 Turbo addition

Deprecated gooseai/gpt-neo-20b, gooseai/gpt-j-6b, and huggingface/falcon-40b-instruct models. Standardized model names for CodeLLama series (removing ‘hf’ suffix) and LLaMA variants. Added GPT-4 Turbo with 128K context window, supporting functions, JSON output, and streaming. Model includes competitive pricing at

0.01/1K prompt tokens and

0.03/1K completion tokens.

FEB 23 FIX - Fixed model selection in Failover App Mode to support any model configuration

Fixed an issue where Failover App Mode was too restrictive with model selection. Users can now specify any model in failover chains, including custom configurations. The error message has been improved to clearly indicate when no failover models are selected versus when no models fit the context window requirements. Additionally, the test provider now properly handles model namespace resolution in failover scenarios.

FEB 23 NEW - Add date, time, and datetime variables in prompts

Added support for dynamic time-based variables in prompts using date, time, and datetime placeholders. When used in prompts, these variables are automatically replaced with the current date (YYYY-MM-DD), time (HH:MM:SS), and datetime (YYYY-MM-DDTHH:MM:SS) respectively. This allows for creating prompts that include current temporal information without manual updates.

FEB 23 IMPROVEMENT - Improved token estimation accuracy and response speed for model selection

Enhanced token estimation accuracy by adding a 1.5% safety margin to prevent context window errors. Optimized performance by caching the token encoder, reducing request latency by 250ms. This improvement helps ensure more reliable model selection based on context size requirements while maintaining faster response times.

FEB 23 IMPROVEMENT - Display parent model and prompt information for custom models

Enhanced model information display to show which base model and prompt custom models are derived from. This adds transparency by exposing the parent_id, parent model reference, and associated prompt_id in the model data schema, helping users better understand their custom model configurations.

FEB 23 FIX - Remove restrictions on internal model access and enable KG model availability

Removed access restrictions that previously limited certain models (like PULZE_V01) to internal users only. All users can now access these previously restricted models, including the new Knowledge Graph (KG) model, regardless of their organization status.

FEB 23 FIX - Improved error handling and consistency for cost controls and Flowz execution

Enhanced error reporting by standardizing error codes across the platform and adding specific error codes like ‘E_REQUEST_COST_EXCEEDED’ for cost control limits and ‘E_RL_PULZE’ for capacity limits. Increased Flowz execution limit from 10 to 20 iterations and added automated monitoring for potential infinite loops. Also improved internal error tracking for loop detection in Flowz.

FEB 23 IMPROVEMENT - Added visual success feedback when copying to clipboard

The copy-to-clipboard functionality now displays a green checkmark icon for 1 second after successfully copying content, providing immediate visual confirmation to users. This replaces the previous behavior where only a toast notification was shown (and only in some cases). The improvement applies to all copy actions throughout the application including API keys and prompt IDs.

FEB 23 IMPROVEMENT - Increased cost precision in logs table from 6 to 8 decimal places

Cost values in the logs table now display with 8 decimal places instead of 6, providing more precise cost tracking for API calls. This improved precision is especially useful for tracking costs of very inexpensive API calls where minute differences matter.

FEB 23 IMPROVEMENT - Updated Python code examples to use modern OpenAI SDK syntax

Updated Python integration examples to use the newer OpenAI SDK pattern with the OpenAI client class instead of deprecated module-level methods. The code now initializes a client with OpenAI(api_key=..., base_url=...) and uses client.completions.create() and client.chat.completions.create() instead of openai.completions.create(). Also updated the default model identifier from ‘pulze-v0’ to ‘pulze’ and corrected the base URL to include a trailing slash.

FEB 22 FIX - Fixed role assignment in Google Chat model responses

Fixed an issue where Google Chat model responses were incorrectly labeled with ‘user’ role instead of ‘assistant’ role in the API response. This ensures that model-generated content is properly attributed in chat conversations. The fix also includes database schema updates to handle Together AI model namespaces more efficiently.

FEB 22 FIX - Fixed response handling for Together AI completions and OpenAI error handling

Fixed an issue where Together AI completion responses weren’t properly handling choice indexes. Also improved OpenAI error handling by changing from ServiceUnavailableError to UnprocessableEntityError, and ensured proper JSON response parsing for OpenAI completions. The fix ensures more reliable response handling and better error reporting for both providers.

FEB 22 MODEL - Added OctoAI model support with Llama 2, CodeLlama, Mistral, and Mixtral variants

Added support for 9 new open-source models via OctoAI, including Llama-2 (13B, 70B), CodeLlama (7B, 13B, 34B, 70B), Mistral-7B, Mixtral-8x7B, and Nous-Hermes-2-Mixtral. All models support streaming, JSON output, and penalty configurations. Notable context windows include 32K tokens for Mixtral models and 16K for CodeLlama-34B, with competitive pricing starting from $0.000025 per token.

FEB 22 TECHNICAL - Updated OpenAI SDK to version 1.12.0 with new API patterns

Updated OpenAI SDK implementation to version 1.12.0, adopting new API patterns like openai.chat.completions.create() instead of the deprecated openai.ChatCompletion.create(). Changes include updated error handling patterns (e.g., BadRequestError instead of InvalidRequestError) and base URL configuration using base_url instead of api_base. This update maintains compatibility with both OpenAI and GooseAI providers.

FEB 22 FIX - Reset email alerts when updating organization spending limits

Fixed email notification behavior when updating organization spending limits. When soft or hard spending limits are modified, the system now automatically resets the alert status, allowing notifications to be sent again when crossing the new thresholds. This ensures organizations receive proper notifications when reaching their updated spending limits.

FEB 22 NEW - Added A100 GPU node pools with 4 and 8 GPU configurations

Expanded A100 GPU infrastructure by adding new node pool configurations with 4-GPU (a2-highgpu-4g) and 8-GPU (a2-highgpu-8g) machines in both development and production environments. Production now includes the complete range of A100 configurations: single GPU (a2-highgpu-1g), 2-GPU (a2-highgpu-2g), 4-GPU, and 8-GPU variants, all deployed in us-west1-b with corresponding local SSD counts matching GPU counts. Development environment also received the 4-GPU and 8-GPU configurations, with minimum node count requirements removed from existing 1-GPU and 2-GPU pools.

FEB 22 NEW - Added Nvidia A100 GPU node pools for high-performance computing workloads

Added two new GPU-enabled node pool configurations to the API infrastructure: a single A100 GPU node pool (a2-highgpu-1g machine type with 1 local SSD, minimum 1 node) and a dual A100 GPU node pool (a2-highgpu-2g machine type with 2 nvidia-tesla-a100 accelerators and 2 local SSDs, minimum 2 nodes). Both node pools are deployed in the us-west1-b location to support GPU-accelerated workloads.

FEB 22 FIX - Fixed Select dropdown width to prevent overflow issues

FEB 20 NEW - Added test models for API simulation and testing

Added new test models including ‘test-model-default’, ‘test-model-1’ through ‘test-model-5’, ‘test-model-repeat’, and specialized test models for error cases and context window testing. These models simulate LLM behavior without making actual API calls, with configurable token usage (up to 1M tokens), customizable error responses, and a small-context (20 tokens) model for testing limits. Includes a model scheduled for deprecation (test-model-will-deprecate) on Dec 31, 2099, and an already deprecated model (test-model-deprecated) from Feb 2, 2022.

FEB 20 IMPROVEMENT - Splash loader now remembers progress when navigating between pages

The initial splash loader now stores its progress state at the window level, ensuring that if the app reloads or navigates during startup, the progress bar resumes from where it left off instead of restarting from 0%. This enhancement provides better visual feedback during app initialization and reduces confusion when page transitions occur during loading.

FEB 19 FIX - Fixed pricing display and token costs for OpenAI and Together AI models

Updated pricing units and token costs for multiple models. GPT-4 preview models now show correct token costs (adjusted by 10x), Together AI models have updated pricing (adjusted by 1000x), and GPT-3.5 Turbo models now show accurate costs (0.0005/0.0015 per 1K tokens). Added specific price units (tokens/characters) for each model provider, with Google models now explicitly billed per character.

FEB 19 FIX - Fixed request errors for Google Gemini Pro model

Fixed API request errors with Google’s Gemini Pro model by adjusting safety thresholds from ‘BLOCK_NONE’ to ‘BLOCK_ONLY_HIGH’ and removing unsupported frequency/presence penalties. The model endpoint was also updated from streaming to standard content generation, improving reliability and request success rates.

FEB 19 FIX - Fixed log level configuration and added support for critical level

Fixed issue with log level configuration not working properly and added support for the ‘critical’ log level. The logging system now correctly responds to level changes through the internal API endpoint, supporting debug, info, warn, error, and critical levels. Also improved logging implementation across the codebase for better consistency.

FEB 19 NEW - Add failover chain settings for Apps

Added support for configuring model failover chains in applications, allowing users to specify a prioritized sequence of backup models. Users can now define multiple models in a failover chain with specific priority orders, and the system will automatically try alternative models if the primary model fails. This feature can be enabled/disabled per app and includes validation to prevent duplicate models in the chain.

FEB 19 FIX - Fixed autocomplete dropdown not opening when clicking on the field

FEB 19 IMPROVEMENT - Improved toast notification design and app settings documentation

Toast notifications now feature a darker, semi-transparent background (dark gray with 80% opacity) with white text and rounded corners for better visual hierarchy and readability. The App Management Settings page now includes enhanced documentation showing that weights and policies can be configured per-request via headers (X-Pulze-Weights and X-Pulze-Policies), with direct links to documentation. External links now support an optional icon prop to display an arrow (↗) indicator. Success messages for settings updates are now more specific, distinguishing between weight, policy, and benchmark model changes.

FEB 19 FIX - Fixed app data refresh after updating failover model configuration

Fixed an issue where the app details page wouldn’t automatically refresh after updating failover model settings. Now when you save changes to your failover model priority order, the app data immediately updates to reflect the new configuration. Additionally, toast notification positioning has been standardized to bottom-center across all pages for better consistency.

FEB 17 IMPROVEMENT - Added Together AI provider compatibility and improved response handling

Added support for Together AI as a new provider and improved compatibility with providers that don’t specify choice indices in their responses. The system now automatically assigns index values to response choices, ensuring consistent behavior across all providers including Together AI.

FEB 16 FIX - Fixed model list display in documentation and monitoring

Fixed an issue where only active base models were being shown in the documentation and model monitoring tools, instead of the complete list of available base models. Now correctly displays all base models alongside Pulze synthetic models in the documentation table and includes them in monitoring scripts.

FEB 14 FIX - Fix Gemini model role attribution and error handling

Fixed two issues with Gemini Pro responses: messages are now correctly attributed to ‘assistant’ role instead of ‘user’, and the API properly handles cases where finishReason is not present in responses. Added safety checks that raise clear error messages when responses are blocked due to safety filters.

FEB 14 MODEL - Deprecating all MosaicML models effective February 29, 2024

All models from the MosaicML provider will be deprecated on February 29, 2024. Users should migrate their applications to alternative models before this date. This deprecation affects all MosaicML models available through the API.

FEB 13 MODEL - Deprecate replicate/dolly-v2-12b model and add failover chain functionality

Deprecating the replicate/dolly-v2-12b model effective February 1, 2024. Additionally, introduced a new failover chain system that allows models to be called in a priority order when primary models fail, bypassing the SMART router. Apps can now configure multiple backup models with specific priority levels that will be automatically attempted in sequence.

FEB 13 FIX - Added support for additional parameters in Replicate models

Enhanced Replicate model integration by adding support for frequency_penalty, presence_penalty, and top_p parameters. Also improved max_tokens handling by ensuring consistent parameter names (max_length, max_new_tokens, max_tokens) and fixed temperature parameter to default to 0.75 when set to 0. These changes enable finer control over model outputs when using Replicate provider.

FEB 13 FIX - Fixed prompt token calculator making excessive server requests

Fixed an issue where the prompt token calculator was constantly fetching data from the server whenever any prompt data changed. The calculator now only triggers when the actual prompt text changes, significantly reducing unnecessary server requests and improving performance during prompt editing.

FEB 13 IMPROVEMENT - Refactored model creation UI with reusable card selector component

Improved the model creation interface by extracting the card selection UI into a reusable CardSelector component. The new component supports configurable columns, nullable selections, disabled states with custom labels, and automatic first-card selection. Also added a confirmation dialog when closing the custom model creation modal to prevent accidental loss of changes.

FEB 13 NEW - Added search autocomplete component with filtering and keyboard navigation

FEB 12 MODEL - Remove knowledge cutoff date restriction for Pulze and Pulze-v0 models

Removed the July 5th, 2023 knowledge cutoff date restriction from Pulze and Pulze-v0 synthetic LLM expert models. Both models maintain their 8,191 token context windows but can now access more recent knowledge. Pulze remains the latest synthetic model while Pulze-v0 continues as the first synthetic LLM expert model.

FEB 12 MODEL - Added Google PaLM 2 and Gemini Pro models with up to 32K context windows

Added support for Google’s language models including Gemini Pro (32K context) and PaLM 2 models: text-bison (8K context), text-bison-32k (32K context), chat-bison (8K context), and chat-bison-32k (32K context). Several models are marked for deprecation on July 6, 2024, including text-bison@001, text-bison-32k@002, and chat-bison@001. All models support streaming and penalties, with token costs of

0.0000005 per completion token and

0.0000025 per prompt token.

FEB 12 MODEL - Added Together.ai foundation models including Yi-34B, DeepSeek Coder, and Mixtral variants

Added support for Together.ai’s foundation model lineup including Yi-34B-Chat (4K context), DeepSeek Coder 33B (16K context), DiscoLM Mixtral 8x7B (32K context), Platypus2-70B (4K context), MythoMax-L2-13B (4K context), and other variants. Models feature varying capabilities with all supporting streaming and temperature controls, while specialized models like DeepSeek Coder focus on programming tasks.

FEB 12 IMPROVEMENT - Added value labels to dashboard metrics

Enhanced dashboard readability by adding units to metric values. Requests now show ‘5 requests’, latency shows values in seconds (e.g., ‘0.5s’), costs display with dollar signs (e.g., ‘$10.50’), and errors show counts like ‘2 errors’. This improvement makes dashboard statistics more intuitive and easier to understand at a glance.

FEB 12 IMPROVEMENT - Preserve model finish reasons in API responses

API responses now include the original finish reason from the model (like ‘length’, ‘stop’, ‘content_filter’) instead of always returning ‘stop’. This provides more accurate information about why a model stopped generating its response.

FEB 12 IMPROVEMENT - Added animated progress bar to application splash screen

The splash screen now displays a visual progress bar that automatically animates while the application loads. The progress bar uses an exponential decay algorithm (0.96 multiplier with 40ms intervals) to create a smooth loading animation that fills to 100% once initialization completes, providing better visual feedback during app startup.

FEB 12 IMPROVEMENT - Changed default landing page to Apps and reorganized navigation menu

FEB 09 NEW - Added search functionality to all model pricing tables

Added a search bar to filter models in the Model Pricing page, allowing users to quickly find specific models by name or provider. The search functionality uses the same filtering logic as other model tables and displays a ‘no results’ message when no models match the search query. This feature was previously only available on other model tables and is now available on the pricing page as well.

FEB 08 NEW - Tables now support custom React components when displaying empty data

Tables can now display custom React components when they have no data to show, in addition to the existing string-based labels. The noRows prop (renamed from noRowsLabel) now accepts either a string for simple messages or a JSX.Element for custom components like interactive empty states. This enables more sophisticated no-data experiences, such as the NoResults component that shows filtered vs. unfiltered empty states with search reset functionality.

FEB 08 FIX - Fixed inconsistent naming: 'Models Pricing' renamed to 'Model Pricing'

Corrected a naming inconsistency throughout the frontend where ‘Models Pricing’ was used instead of the singular ‘Model Pricing’. This fix updates the sidebar menu item label, icon references, URL route from ‘/models-pricing’ to ‘/model-pricing’, and internal type definitions to use consistent singular form naming.

FEB 08 IMPROVEMENT - Made app name change dialog dismissible by clicking outside

The dialog for changing app names in the app management detail view can now be dismissed by clicking outside the modal or pressing the Escape key. Previously, users were required to either save changes or use the close button to exit the dialog.

FEB 08 IMPROVEMENT - Enhanced model search to include description and URL fields

Improved the frontend model search functionality to search across multiple fields including namespace, description, and URL. Users can now find models by searching for terms in any of these fields, making it easier to discover models based on their descriptions or repository URLs, not just their namespace.

FEB 08 IMPROVEMENT - Added Flowz preview capability from Apps Table List

Flowz can now be previewed directly from the Apps Table List without requiring the full App Management context. The FlowzComponent and FlowzModal have been refactored to accept the app object as a direct prop instead of relying on AppManagementContext, enabling readonly Flowz previews from any table view where apps are listed.

FEB 07 MODEL - Upgrade RAG keyword extraction model from Mistral Tiny to GPT-3.5 Turbo

Enhanced the Retrieval-Augmented Generation (RAG) system by upgrading the keyword extraction model from Mistral Tiny to GPT-3.5 Turbo. This change in the optimized path should provide more accurate keyword extraction while maintaining efficient performance.

FEB 07 IMPROVEMENT - Added informative error message for streaming requests

Added user notification when attempting to use streaming responses. The system now explicitly informs users that streaming is not yet supported by Pulze (even for models that support it natively) and suggests checking back later. Also added validation to check if requested models support streaming and function/tool calls before processing requests.

FEB 07 IMPROVEMENT - Standardized API key error messages and improved error handling

Improved error handling for API key validation with more descriptive error messages. Invalid API keys now return a consistent OpenAI-style error format with clearer guidance, including a link to get valid API keys. API keys must start with ‘sk-’ prefix and will return standardized error responses if invalid.

FEB 07 IMPROVEMENT - Updated Base Models table layout to match design specifications

Redesigned the Base Models table layout with improved visual organization. Model deprecation messages now appear below descriptions instead of in a separate column, providing a cleaner and more intuitive interface. The table structure has been streamlined by removing empty columns and reorganizing content for better readability.

FEB 06 IMPROVEMENT - Increased max tokens limit for Slack chatbot responses to 4000

Set explicit maximum token limit of 4000 tokens for Slack bot responses, ensuring more comprehensive answers while maintaining reasonable response lengths. This improvement helps prevent truncated messages while optimizing the balance between detailed responses and Slack message constraints.

FEB 06 FIX - Fixed max tokens parameter for MosaicML models

Fixed a typo in the MosaicML provider implementation where ‘max_new_tokens’ parameter was incorrectly spelled as ‘max_new_tokes’. This bug was preventing proper token length control for MosaicML model completions.

FEB 05 NEW - Added Slack Pulze App integration

Added support for integrating Slack workspaces with Pulze through a dedicated Slack App. The integration allows teams to install the Pulze app, storing team-specific access tokens, bot IDs, and enterprise settings. This feature includes proper authentication handling and prevents the Slack app from being accidentally deleted like other playground apps.

FEB 05 FIX - Fixed conversion of single message chat completions to prompts

Fixed an issue where single message chat completions were being unnecessarily converted into tagged prompts with role labels ([USER]:, [ASSISTANT]:, etc.). Now, single messages are kept in their original format, while multi-turn conversations maintain proper role tagging structure. This improves prompt clarity and maintains more natural conversation flow for single-message interactions.

FEB 01 IMPROVEMENT - Removed Playground links from landing page navigation and footer

FEB 01 FIX - Fixed organization settings tab navigation and routing

January 2024

JAN 31 NEW - Added support for custom model creation with base model fine-tuning

Users can now create and manage custom models based on existing base models through new API endpoints. The feature includes the ability to specify custom model names and descriptions, with automatic namespace generation to ensure uniqueness. Custom models can be created, listed, and deleted at the organization level, with full integration into existing app configurations through custom model settings.

JAN 31 MODEL - Introduce GPT-4-0125-preview and update OpenAI model pricing

Added support for GPT-4-0125-preview model with 128K context window, offering improved task completion and reduced ‘laziness’. Updated token pricing for GPT-4-1106-preview (0.001¢ prompt, 0.003¢ completion) and GPT-3.5-turbo-instruct (0.0015¢ prompt). The new GPT-4 model supports functions, streaming, JSON mode, and penalty parameters.

JAN 31 NEW - Add benchmark model selection for applications

Added the ability to select a benchmark model for your application in the settings page. The benchmark model serves as a baseline for comparing all requests in terms of score and cost savings. Users can now choose from any available base model in their organization, with the selection displayed showing both the provider and model name for easy identification.

JAN 31 IMPROVEMENT - Reduced tooltip font size for better readability

JAN 29 FIX - Improved model routing behavior and usage reporting

Fixed model routing logic to prevent automatic model hopping when using non-synthetic models (specific model names), ensuring retries stay with the requested model. Enhanced usage reporting with more precise decimal formatting for billing cycle usage (now shows 2 decimal places). Also improved Anthropic Claude prompt formatting by removing extra space after HUMAN_PROMPT.

JAN 29 IMPROVEMENT - Improved tooltip system with better type safety and behavior

JAN 26 MODEL - Updated internal scoring for Pulze synthetic models

Enhanced model scoring system for Pulze’s synthetic models (pulze and pulze-v0.1) by integrating with a new scoring service. The change improves model routing decisions and adds additional access controls to restrict internal models to Pulze employees only. Model scoring is now handled through a dedicated scoring API endpoint.

JAN 26 IMPROVEMENT - Enhanced support for completion models by allowing access to conversation history

Modified the request handling logic to maintain conversation history for completion models when converting between completion and chat completion formats. This improvement enables traditional completion models to have better context awareness across multiple interactions, similar to chat models.

JAN 26 FIX - Improved subscription and payment error handling

Enhanced handling of failed payments and subscription management with clearer error messages. Updated subscription error message to better explain upgrade/downgrade options, and added automated notifications for failed payments. Now sends internal alerts for payment issues in non-production environments.

JAN 26 FIX - Fixed usage tracking alerts and email notifications for token consumption

Fixed a bug where usage tracking emails and alerts weren’t consistently sending when organizations crossed token consumption thresholds. The system now properly tracks and notifies users when they reach soft limits (configurable), hard limits, or percentage-based thresholds (e.g., 80% of quota) of their token allocation. Additionally improved logging and tracking of trial accounts’ usage.

JAN 25 MODEL - Added complete GPT-3.5 model lineup including 16K and instruct variants

Added six GPT-3.5 models to the platform: gpt-3.5-turbo-1106 (16K context), gpt-3.5-turbo-16k, gpt-3.5-turbo-instruct (4K context), gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, and gpt-3.5-turbo-0301. All models support streaming, JSON mode, function calling, and response penalties. The 0613 and 0301 variants are marked for deprecation on June 13, 2024.

JAN 25 IMPROVEMENT - Added local validation for Flowz before server validation

Improved Flowz validation by adding client-side validation that runs before making API requests to the server. This provides faster feedback when Flowz configurations contain errors, and error messages are now displayed directly in the modal interface using a dedicated error component instead of only showing toast notifications.

JAN 25 IMPROVEMENT - Added Jeev Balakrishnan as CTO & Co-Founder to team page

Updated the team page to include Jeev Balakrishnan as CTO & Co-Founder, positioned prominently after the CEO. The update also standardized the founder title formatting from ‘CEO and Co-Founder’ to ‘CEO & Co-Founder’ for consistency across leadership profiles.

JAN 25 IMPROVEMENT - Updated team page to reflect Fabian Baier as Co-Founder

Updated the team page to correctly display Fabian Baier’s title as ‘CEO and Co-Founder’ instead of ‘CEO and founder’. This change ensures accurate representation of the leadership structure on the company’s about page.

JAN 25 FIX - Fixed Last Invoice Date display showing incorrect timestamp

Corrected the ‘Last Invoice Date’ timestamp display in the billing usage section. The date was previously showing an incorrect value due to missing timestamp conversion from seconds to milliseconds. Users will now see the accurate date of their last invoice in the usage summary.

JAN 24 IMPROVEMENT - Added feature tooltips to pricing table

Enhanced the pricing table interface with informative tooltips that explain each feature in detail. Users can now see additional information about features like app limits, LLM routing, Flowz configuration, custom prompts, fine-tuning capabilities, and model selection routing by hovering over feature names. This makes it easier to understand the specific capabilities included in each subscription tier.

JAN 24 IMPROVEMENT - Streamlined user onboarding and profile management

Simplified the user onboarding process by removing Auth0 profile editing restrictions and improving terms acceptance flow. Users can now update their profiles regardless of authentication method, and organizations can accept Terms of Service and Privacy Policy in a single step during organization updates.

JAN 24 IMPROVEMENT - Improved text readability on homepage with increased line spacing

Enhanced the visual spacing of all major headings on the homepage by applying increased line-height (leading-snug) to improve readability. This affects 11 headline elements across the landing page, including the main hero section, feature callouts, and solution descriptions, making text easier to scan and read.

JAN 23 NEW - Added Pulze Insights dashboard in Grafana for monitoring platform metrics

Added a new Grafana dashboard called “Pulze Insights” that provides monitoring and analytics for the platform. The dashboard displays key metrics including the number of organizations, monthly active applications (apps with at least one request in the last 30 days), and request volumes over the last 30 days. The dashboard connects to the API database and excludes Model Monitor applications from metrics calculations.

JAN 23 IMPROVEMENT - Updated homepage call-to-action button from 'Try the Playground' to 'Try for free'

The primary call-to-action button on the homepage now displays ‘Try for free’ instead of ‘Try the Playground’, and directs users to the signup page rather than a separate playground URL. This change provides clearer messaging about accessing the platform and streamlines the onboarding experience by taking users directly to account creation.

JAN 19 FIX - Re-enable OpenAI-compatible error fields and improve error handling

Fixed error response format to better match OpenAI’s API specification by re-enabling detailed error fields (code, type, message, param). Also improved error handling by using more specific InvalidRequestError instead of generic APIError for cases like deprecated models, unavailable models, and invalid prompt IDs.

JAN 19 FIX - Updated expired Slack community invite link

Fixed the Community Slack invitation link on the homepage that had expired. Users can now successfully join the Pulze AI Community Slack workspace using the new invite link to connect with other community members, share insights, and collaborate.

JAN 18 IMPROVEMENT - Improved Flowz validation and app creation response format

Enhanced Flowz validation by checking diagrams for recursion before creation, preventing potential infinite loops. Also standardized app creation response format by changing ‘app_id’ field to ‘id’ in API responses. The validation now happens earlier in the process, providing better error messages when diagram configurations would cause recursive loops.

JAN 18 IMPROVEMENT - Added max token control and debug mode for Replicate API models

Enhanced Replicate API integration with better token control by adding both max_length and max_new_tokens parameters. This improves token limit handling and compatibility across different Replicate models. Also added debug mode support controlled by application settings.

JAN 18 IMPROVEMENT - Enhanced Flowz validation and prompt handling

Updated Flowz validation endpoint to validate specific app configurations via a new ‘/validate/for-app/’ endpoint, replacing the previous flowz_id validation method. Improved error handling and validation for prompts across the application, with more consistent verification of prompt ownership and access permissions.

JAN 18 FIX - Fixed typo in subscription acceptance text

Corrected a spelling error in the subscription management interface where “accpt” was misspelled. The text now correctly reads “I understand and accept that all my remaining free credits will be voided” when users are reviewing subscription terms.

JAN 18 IMPROVEMENT - Clarified that trial credits will be voided when starting a subscription

Updated billing UI to explicitly state that all remaining free trial credits will be voided when subscribing to a paid plan. The subscription confirmation dialog now displays the current trial credit balance and requires users to acknowledge that these credits will not transfer to the paid subscription. Added helper text explaining that trial credits expire and cannot be carried over.

JAN 17 IMPROVEMENT - Added prompt visibility in Slack responses

Slack responses now display the original prompt text above the response, providing better context and conversation clarity for users. The prompt appears as a context block with plain text formatting in the Slack message thread.

JAN 17 TECHNICAL - Enhanced app usage tracking and improved app metadata documentation

Modified app usage tracking to make the ‘since’ parameter optional when querying request history, allowing for more flexible usage reporting. Enhanced app schema documentation by adding detailed field descriptions for API keys, organization relationships, and app settings. These changes improve the developer experience and data tracking capabilities.

JAN 17 NEW - Added AskPulze Slack Integration for AI Queries

Introduced a new Slack integration that allows users to query Pulze AI directly from Slack using the /askpulze command. Responses are displayed in-channel with rich formatting, including the model used and response latency. The integration features secure request validation, asynchronous processing, and includes Pulze branding in responses with contextual metadata.

JAN 17 IMPROVEMENT - Enhanced Trial tier with unlimited apps and personalized support

Updated the Trial subscription tier to include unlimited applications (previously displayed as ‘Unlimited’, now ’∞ Unlimited’) and upgraded support level from ‘Community’ to ‘Personalized Support’ with customer success access. Also standardized support level naming across tiers, with the Startup tier now showing ‘Community Support’ instead of ‘Community support’.

JAN 15 FIX - Fixed disabled button visual style to remove border

Fixed the visual appearance of disabled and loading buttons by removing the border. Previously, disabled buttons displayed with an unintended border that made them appear inconsistent with the intended design. Disabled buttons now correctly show with no border, gray background (pulze-200), and gray text (pulze-500).

JAN 13 FIX - Fixed promocode validation to handle whitespace in input

Fixed an issue where promocodes with leading or trailing whitespace would fail to apply correctly. The system now automatically trims whitespace from promocode inputs before processing them with the billing adapter, ensuring more reliable coupon redemption.

JAN 12 IMPROVEMENT - Enhanced Flowz validation with better error reporting and recursion detection

Improved the Flowz validation system to provide more detailed feedback when validation fails, particularly for recursive flows. Users now receive specific information about which app caused validation failures in recursive scenarios, and a new /validate endpoint allows checking Flowz validity without making changes. The update also adds safeguards to prevent invalid Flowz updates by restoring previous valid states automatically.

JAN 12 MODEL - Remove obsolete models from pricing table and app settings

Modified the model filtering system to automatically hide deprecated models from the pricing table and app settings once they reach their deprecation date. Previously deprecated models remained visible in these interfaces even after their end-of-life date. This change helps prevent users from selecting models that are no longer available.

JAN 12 NEW - Added support for percentage-based promotional discounts

Enhanced promotional code system to support percentage-based discounts in addition to fixed amount discounts. When applying promotional codes, the system now correctly handles both types of discounts, with percentage discounts being automatically calculated based on subscription price. The pricing table UI now displays discounted prices when a percentage-based promotion is active.

JAN 12 IMPROVEMENT - Updated call-to-action button text for consistency across homepage

Standardized call-to-action button labels throughout the homepage to use ‘Try for free’ consistently. Also corrected the features table label from ‘Increase time to market’ to ‘Decrease time to market’ to accurately reflect the benefit of faster development with model testing capabilities.

JAN 11 FIX - Fixed subscription discount calculation error when no discount is applied

Fixed an issue where the billing system would error when trying to calculate discounted subscription prices for users without an active discount. The system now properly checks if a discount exists before attempting to apply percentage-based price reductions.

JAN 11 IMPROVEMENT - Increased trial subscription token limit to 1 billion tokens

Trial subscriptions now come with a significantly increased token limit of 1 billion tokens, up from the previous 50 million tokens. This 20x increase gives trial users substantially more capacity to test and evaluate the platform while maintaining unlimited app creation capabilities.

JAN 11 FIX - Fixed chat completion handling in Flowz Engine and improved iteration feedback

Fixed an issue where chat completion messages weren’t being properly handled in the Flowz engine - now correctly processes the message object from response choices. Also improved iteration feedback by adding proper logging and clearer error messages when flow execution hits the maximum iteration limit. The engine now properly maintains conversation context between iterations.

JAN 11 FIX - Fixed discount indicator showing when no actual discount exists

Fixed the subscription pricing display to only show the strikethrough original price when a discount is actually applied (discounted price is less than original price). Previously, the strikethrough indicator could appear even when the discounted price equaled the original price, incorrectly suggesting a discount was available.

JAN 11 IMPROVEMENT - Improved subscription plans layout with responsive grid design

Redesigned the subscription plans modal to use a responsive grid layout that adapts from 1 column on mobile to 3 columns on desktop (or 4-5 columns for wider screens depending on plan count). Reduced modal padding from 14 to 7 units for better space utilization, removed fixed width constraints on individual plan cards, and adjusted modal minimum width to 1024px on extra-large screens for optimal viewing.

JAN 09 MODEL - Deprecated legacy OpenAI models are now hidden from documentation

Several legacy OpenAI models have been marked as deprecated as of January 4, 2024: text-ada-001, text-babbage-001, text-curie-001, text-davinci-002, and text-davinci-003. These models will no longer appear in the documentation or be available by default for new applications. Existing applications using these models should migrate to newer alternatives.

JAN 08 NEW - Added Flowz workflow engine for visual flow-based application logic

Introduced Flowz, a new visual workflow system that allows users to create, manage, and execute flow-based logic diagrams for their applications. Users can now create workflow diagrams with nodes and connections, attach them to apps, and the Flowz engine will validate and execute the defined logic. This includes new API endpoints for creating and retrieving Flowz (/flowz), database tables for storing flow diagrams as JSON, and an execution engine that processes the visual workflows.

JAN 08 IMPROVEMENT - Updated authentication method label from 'Password' to 'Email and Password'

Changed the display label for password-based authentication from ‘Password’ to ‘Email and Password’ to provide clearer information about the authentication method. This affects how the login type is shown in welcome emails and user profile information. The change also includes internal improvements to ensure user and organization data is properly loaded when sending verification emails to new users.

JAN 05 IMPROVEMENT - Prevent adding users to organizations they already belong to

Added validation to prevent users from being added to an organization they are already a member of through the internal tool. The system now checks membership status before processing the add request and returns a clear error message “You are already part of this organization” if the user is already a member, avoiding duplicate memberships and potential data inconsistencies.

JAN 04 IMPROVEMENT - Improved welcome email control with selective sending on development

Welcome emails can now be selectively controlled in development environments. Emails will only be sent if the feature is explicitly enabled or if the user’s email contains ‘welcome’ as a keyword, making it easier to test the email flow without spamming real addresses. The configuration has been renamed from EMAILS_ENABLED to WELCOME_EMAILS_ENABLED for better clarity.

JAN 04 FIX - Fixed security vulnerability in organization permission management

Fixed a security issue where users could grant other members permissions they themselves didn’t have. Now when updating another user’s permissions, you can only add or remove permissions that you currently possess. For example, if you only have ‘editor:all’ access, you cannot grant or revoke ‘admin:all’ permissions from other users. The system now sanitizes permission changes to prevent privilege escalation attacks.

JAN 04 IMPROVEMENT - Removed billing configuration validation requirement

Removed the strict billing validation check that previously blocked operations when organization billing details were incomplete. Users will no longer encounter the ‘You don’t have complete billing details’ error (HTTP 417) when Stripe ID, billing email, postal code, or country information is missing from their organization settings.

JAN 03 FIX - Fixed database migration downgrade command in documentation

Corrected the Alembic downgrade command in the migration README from ‘alembic downgrade base’ to ‘alembic downgrade -1’. The updated command now properly rolls back the database to the previous migration instead of reverting all the way to the initial base state, which is the correct approach for testing migration rollbacks.

JAN 02 IMPROVEMENT - Automatically capture billing address when adding first payment method

When adding a payment method, the system now automatically stores billing details (postal code and country) from the payment method if no address information is currently on file. This eliminates the need for separate billing configuration steps and removes billing verification requirements from multiple endpoints, streamlining the payment setup process for new organizations.

JAN 02 IMPROVEMENT - Added domain validation for billing email in organization setup

Enhanced the organization setup process to validate the domain of the billing email address. When updating organization settings, the system now extracts and verifies the domain from the billing email (the part after ’@’) to ensure it’s valid before saving changes.

JAN 02 IMPROVEMENT - Enhanced parameter support across AI providers

Improved parameter handling across multiple AI providers. Added support for logit_bias parameter in AI21 Labs, and added support for presence_penalty, frequency_penalty, n, logit_bias, top_p, and stop parameters in GooseAI. Added comprehensive documentation comments indicating which parameters are supported or not supported for each provider (Anthropic, Cohere, MistralAI, MosaicML), improving API consistency and transparency.

JAN 02 IMPROVEMENT - Allow null values for feedback text in request ratings

The feedback field in request ratings now accepts null values in addition to text strings. Previously, feedback required a string value (defaulting to empty string), but now it can be explicitly set to null when no feedback text is provided alongside a rating.

December 2023

DEC 21 NEW - Added 'Try for free' button labels for subscription tiers on homepage

Subscription tier cards on the homepage now display ‘Try for free’ button labels for non-authenticated users or users without organization access. This applies to the Startup, Growth, and Scale tiers for both monthly and yearly subscription options, making it clearer to new users that they can try these plans without immediate commitment.

DEC 21 IMPROVEMENT - Welcome email now displays user's login method (Google, Github, or Password)

DEC 21 IMPROVEMENT - Environment-specific branding with 'Pulze', 'Pulze (DEV)', and 'Pulze (LOCAL)'

Updated the application name displayed across the platform to show environment-specific branding. Production displays ‘Pulze’, development shows ‘Pulze (DEV)’, and local environments show ‘Pulze (LOCAL)’. This affects email subjects, email footers, and other user-facing areas where the company name appears, making it easier to identify which environment you’re working in.

DEC 21 FIX - Fixed model scoring when all models receive a score of 0

Fixed an edge case where the system would throw an error when all models in a comparison received a score of 0. The normalization function now correctly handles this scenario by preserving the original zero scores instead of failing with a 417 error, allowing the evaluation to complete successfully.

DEC 20 NEW - Added MistralAI provider support to cloud deployment

Added MistralAI as a new AI provider with support for three API keys configured with 50 requests per minute (RPM) each using least connection load balancing mode. Users can now access MistralAI models through the Pulze API platform alongside existing providers like OpenAI, Anthropic, and others.

DEC 20 NEW - Added endpoint to retrieve available permissions

Added a new /permissions endpoint that returns a list of all available permissions in the system. Internal users (belonging to Pulze Seed organization) will also see internal-only permissions in their results. This allows clients to dynamically discover what permissions are available for role and access control management.

DEC 20 IMPROVEMENT - Welcome emails now use email address as fallback when first name is missing

Welcome emails now gracefully handle cases where users don’t have a first name set by using their email address as a fallback. Previously, if a user’s first_name field was empty, the welcome email would display nothing in the greeting. Now it will display the user’s email address instead, ensuring a more personalized experience even when full profile information isn’t available.

DEC 20 NEW - Added model deprecation warnings with automatic filtering of deprecated models

Introduced automatic deprecation handling for AI models with date-based filtering. When a model has a scheduled deprecation date, it will automatically be excluded from available models for requests once that date passes. Users will see warning messages for models approaching deprecation in the format ‘This model will be deprecated on YYYY/MM/DD. We recommend you disable it for your app before the deprecation date to avoid failed requests.’ This helps prevent failed requests by proactively filtering out deprecated models from the selection pool.

DEC 19 MODEL - Added MistralAI provider with three models

Added support for MistralAI as a new provider with three models: Mistral Tiny (mistral-tiny) powered by Mistral-7B-v0.2, Mistral Small (mistral-small) powered by Mixtral-8X7B-v0.1 with 12B active parameters, and Mistral Medium (mistral-medium). All models support 32K context windows and are enabled for chat completions with streaming support.

DEC 15 FIX - Fixed model feature validation and made max_tokens optional for providers

Fixed the verify_model_features() function to correctly use policies from either app settings or request headers when validating unsupported features. Made max_tokens parameter optional for AlephAlpha and Replicate providers instead of returning errors when not provided. Added ignore_unsupported_features policy to request labels for better tracking.

DEC 15 FIX - Fixed async/await handling in organization usage tracking

Resolved an issue where the organization request usage tracking function was incorrectly being awaited as an asynchronous operation when it was actually a synchronous function. This fix prevents potential runtime errors and ensures that usage data is properly recorded to the organization table after each API request.

DEC 14 NEW - Add response_format parameter for JSON mode support

Added support for the response_format parameter in completion requests, allowing users to specify whether the model should output text or structured JSON objects. When using JSON mode (type: “json_object”), the API will validate that the model supports JSON output and will return an error if attempted with incompatible models. This follows OpenAI’s API specification requiring “JSON” to appear in the prompt context when JSON mode is enabled.

DEC 12 FIX - Fixed validation error when n or best_of parameters are None

Fixed a bug in model validation that would incorrectly throw an error when the ‘n’ or ‘best_of’ parameters were set to None. The validation now properly checks if these parameters exist and are greater than 1 before rejecting requests to models that don’t support these parameters, preventing false validation failures.

DEC 12 NEW - Added subscription tier system with trial periods and usage tracking

Introduced a comprehensive subscription management system with support for trial periods (21 days default), subscription tiers (SCALE, GROWTH, ENTERPRISE), and billing cycles (monthly/yearly). Organizations now have automatic trial tracking, subscription pause/cancellation reasons, and enhanced usage views for token and cost monitoring across applications and organizations. This enables better billing transparency and subscription lifecycle management.

DEC 12 NEW - Added ignore_unsupported_features policy to handle model feature compatibility

Introduced a new policy ‘ignore_unsupported_features’ (defaults to true) that controls how the system handles unsupported model features. When enabled, requests using unsupported features like frequency_penalty, presence_penalty, n, or best_of parameters are processed normally by ignoring the unsupported parameters. When disabled, requests will fail with a FEATURE_NOT_SUPPORTED_BY_MODEL error if the target model doesn’t support the requested features. This provides users with flexibility to either enforce strict feature compatibility checking or allow graceful degradation when using models with limited capabilities.

DEC 11 FIX - Fixed API error handling for single-item prompt arrays

Fixed the API to properly accept prompts formatted as single-item arrays (e.g., [“Say Hello”]) without throwing an error. The error status code for multiple prompts has been changed from 400 Bad Request to 422 Unprocessable Entity to better reflect the validation error. Multiple prompts in an array still correctly return an error as this feature remains unsupported.

DEC 08 FIX - Fixed error when re-inviting existing organization members

Fixed an issue where attempting to invite a user who is already a member of the organization would cause an error. The system now checks if the email address belongs to an existing organization member before creating an invitation, and returns a clear error message. This prevents duplicate invitations and improves the user experience when managing organization members.

DEC 08 NEW - Added ability to create models with custom prompts

Users can now create custom model variants with pre-configured prompts that automatically wrap user inputs. When creating a model with a prompt, the system establishes a parent-child relationship between the base model and the custom variant, allowing the prompt to be applied automatically before the request is sent to the provider. This enables teams to standardize prompt templates across their organization without requiring users to manually include them in each request.

DEC 07 IMPROVEMENT - Added gcloud CLI prerequisite and Docker authentication instructions

Updated documentation to clarify that gcloud CLI is a required prerequisite for local development. Added instructions for configuring Docker authentication to access Google Cloud artifact registries using the command gcloud auth configure-docker us-west1-docker.pkg.dev, which is necessary for pulling required container images.

DEC 06 NEW - Added ability to create custom models with specific prompts

Users can now create custom model configurations that are linked to specific prompts, allowing for reusable model-prompt combinations. These custom models are organization-specific and can be associated with apps. A new endpoint POST /models/with-prompt enables creating these configurations, and DELETE /models//with-prompt allows removing them. The system now distinguishes between base model settings and prompt-based model settings when retrieving app configurations.

DEC 04 IMPROVEMENT - Upgraded Milvus vector database from version 2.3.0 to 2.3.3

Updated the Milvus vector database infrastructure by upgrading the Milvus Operator helm chart from v0.8.0 to v0.8.6, which targets Milvus 2.3.3. This upgrade brings performance improvements, bug fixes, and enhanced stability to the vector database layer used for semantic search and retrieval operations.

DEC 04 FIX - Fixed prompt modification timestamp not updating when editing prompts

Fixed an issue where the ‘modified_on’ timestamp was not being updated when users edited their prompts. Now when you update a prompt’s content, title, or description, the modification timestamp is correctly set to the current date and time, ensuring accurate tracking of when prompts were last changed.

November 2023

NOV 24 MODEL - Added Claude 2.0 and Claude 2.1 models with updated pricing and capabilities

Added two new Anthropic Claude models: Claude 2.0 with 100K context window and Claude 2.1 with 200K context window featuring reduced hallucination rates. Updated the claude-2 alias to point to the latest claude-2.x model with 200K context window. All models now have improved pricing at

8 per million prompt tokens and

24 per million completion tokens (previously $15 for both).

NOV 24 NEW - Added platform feedback submission endpoint

Users can now submit feedback about how they discovered the platform through a new feedback endpoint. This information is automatically synced to the user’s HubSpot contact profile in the ‘source_details’ field, enabling better understanding of user acquisition channels.

NOV 22 FIX - Fixed crash when encoding special tokens in prompts

Resolved an issue where the token estimation function would crash when encountering special tokens in prompt text. The encoder now correctly handles all special tokens by treating them as actual special tokens rather than regular text, preventing crashes during token counting for model selection and cost estimation.

NOV 22 IMPROVEMENT - Increased default rate limit from 60 to 200 requests per minute

The default rate limit for API requests has been increased from 60 to 200 requests per minute for both organizations and applications. This change provides more headroom for API usage and reduces the likelihood of hitting rate limit errors during normal operations.

NOV 22 FIX - Fixed error when accepting invitation to organization user already belongs to

Fixed an issue where users would encounter an incorrect error response (401 Unauthorized) when attempting to accept an invitation to an organization they already belong to. The system now properly validates organization membership before checking invitation details and returns the correct error status (409 Conflict) with a clear message indicating the user already belongs to the organization.

NOV 21 NEW - Added custom labels support for Loki logs exports

Logs exported to Grafana Loki can now include custom labels from the response metadata. The exporter automatically extracts any labels defined in the response’s metadata.labels field and adds them as stream labels in Loki, making it easier to filter and query logs based on custom attributes like ‘unit’, ‘environment’, or other user-defined labels.

NOV 21 FIX - Fixed typo in frequency_penalty parameter for Cohere provider

Corrected a typo where ‘frequence_penalty’ was misspelled as ‘frequence_penalty’ instead of ‘frequency_penalty’ in the Cohere provider implementation. This fix ensures the frequency_penalty parameter is properly passed to Cohere API calls, allowing users to correctly control the penalty applied to frequently used tokens in generated responses.

NOV 21 NEW - Added configurable rate limiting per organization and application

Organizations and applications now have individual rate limit settings stored in the database, defaulting to 60 requests per minute. This replaces the previous hardcoded limit of 9,500 requests per minute and allows for customized rate limiting on a per-org and per-app basis. Rate limits are enforced through Redis and can be configured independently for each organization and application.

NOV 21 FIX - Fixed crash when accessing metrics with None labels

Fixed an issue where the metrics system would crash when trying to process items with None labels. The label validation and stringification functions now properly handle None values by returning empty dictionaries, preventing errors when custom labels are not provided.

NOV 20 FIX - Fixed prompt validation to support multiline templates with prompt placeholder

Fixed an issue where prompt templates with the prompt placeholder on non-first lines were incorrectly rejected during validation. The regex pattern now supports multiline prompts by adding the DOTALL flag ((?s)), allowing the placeholder to appear anywhere in the template including after newlines and multiple lines of text.

NOV 20 NEW - Added CORS support for external domain access to API endpoints

Enabled Cross-Origin Resource Sharing (CORS) for external domains to access specific API endpoints including completions, models management, apps configuration, and logs. This allows web applications hosted on external domains to make direct API calls to endpoints like /completions, /models/rank, /models/active, /logs, and others without CORS restrictions, while maintaining security controls for non-API routes.

NOV 20 FIX - Fixed parameter typo in Aleph Alpha provider integration

Corrected a parameter name typo in the Aleph Alpha provider where ‘frequence_penalty’ was misspelled and has been fixed to ‘frequency_penalty’. This ensures the frequency penalty parameter is properly passed to the Aleph Alpha API, allowing users to correctly control repetition in model responses when using Aleph Alpha models.

NOV 17 MODEL - Added GPT-4 Turbo (gpt-4-1106-preview) with 128K context window

Added support for OpenAI’s GPT-4 Turbo (gpt-4-1106-preview) model with 128K token context window, priced at

0.01 per 1K prompt tokens and

0.03 per 1K completion tokens. This model offers significantly larger context windows compared to previous GPT-4 versions. Also updated the description for Llama 2 70B Chat to be more accurate.

NOV 16 IMPROVEMENT - Removed domain DNS validation requirement during email signup

Signup validation no longer rejects email addresses based on DNS pingability checks. Previously, the system would reject emails if their domain couldn’t be resolved via DNS lookup (socket.gethostbyname), which could falsely reject valid domains experiencing temporary DNS issues. Now only temporary/disposable email domain checks remain, allowing legitimate users with valid but temporarily unreachable domains to sign up successfully.

NOV 15 FIX - Fixed organization user deletion endpoint and improved permission handling

Fixed a critical bug in the delete user from organization endpoint that was incorrectly setting user status to inactive before deletion, causing the operation to fail. Added comprehensive test coverage for user permission updates and user deletion operations, including verification that users can properly update and remove other users based on their role permissions (admin, editor, viewer).

NOV 14 MODEL - Added GPT-4 Turbo (gpt-4-1106-preview) with 128K context window

Added support for GPT-4 Turbo (gpt-4-1106-preview) with 128K context window, improved instruction following, JSON mode, reproducible outputs, and parallel function calling. Maximum output tokens: 4,096. Additionally, implemented per-model parameter support configuration, enabling models to declare capabilities like function calling, streaming, JSON output, frequency/presence penalties, and n-parameter support, ensuring API requests only use parameters supported by each specific model.

NOV 14 IMPROVEMENT - Made max_tokens parameter optional with intelligent defaults

The max_tokens parameter is now optional (defaults to None instead of 16) and will be automatically set to appropriate values based on model requirements. When a model requires max_tokens but none is provided, the system uses a default of 16 tokens. This change also improves latency metrics by calculating per-token latency based on actual response tokens rather than the requested max_tokens limit, providing more accurate performance measurements.

NOV 13 NEW - Add ability to create models via UI with chat model type specification

Added the ability to create and manage AI models through the user interface. Models can now be designated as chat-type models (a new ‘is_chat’ field was added), with existing OpenAI models (GPT-4, GPT-3.5-Turbo, and GPT-4-32K) automatically marked as chat models. The model creation system now properly tracks which user added each model using their Auth0 ID, and includes improved error handling for duplicate error codes.

NOV 13 NEW - Added function calling support for OpenAI chat completions

OpenAI chat completion requests now support function calling with tools and tool_choice parameters. Users can define functions that the model can call during conversations, enabling structured outputs and interactive workflows. The implementation includes support for tool definitions with parameters, function call arguments, and tool choice strategies (auto, none, or specific function selection).

NOV 13 NEW - Added custom labels support to Prometheus metrics

Prometheus metrics now support custom labels that can be passed through request metadata, allowing users to add their own key-value pairs for better metric filtering and organization. Custom label keys are automatically validated and sanitized to meet Prometheus naming requirements (alphanumeric and underscores only), with values converted to strings. This applies to all three metric types: model latency gauges, app cost gauges, and app usage gauges.

NOV 10 FIX - Fixed label/value endpoint filtering and query logic

Fixed the logs filtering endpoint to properly handle optional date_to and app_ids parameters. Previously, these filters were incorrectly applied even when not provided, causing queries to fail or return incorrect results. The labels endpoint now correctly retrieves label keys and their associated values for filtered log searches.

NOV 09 FIX - Fixed LlamaIndex integration OpenAI format compatibility issues

Fixed multiple issues with LlamaIndex integration when using OpenAI-compatible format: corrected response format to include required fields (index, model, object type, id, usage), fixed prompt_id validation bug that was using wrong variable (app_update.prompt_id instead of prompt_id parameter), removed duplicate metadata fields (provider, namespaced_model), and standardized model/provider references to use namespace format. These fixes ensure LlamaIndex-based RAG queries work correctly with OpenAI SDK clients.

NOV 09 NEW - Added subscription billing with Stripe integration (test mode)

Introduced subscription billing capabilities through Stripe integration, currently available in test mode only. The system now tracks token-based pricing for AI models, with separate costs for prompt and completion tokens stored directly in the database. Added subscription management fields to organizations including subscription IDs and end dates, enabling metered billing for model usage.

NOV 08 NEW - Add filtering options for prompts list by visibility scope

Added a new ‘show’ parameter to the prompts list endpoint that allows filtering prompts by visibility scope. Users can now filter to view only public prompts (‘public’), organization-specific prompts (‘org’), or all prompts (‘all’, default). The public filter also supports an optional ‘include_for_review’ flag to include prompts that are published but pending review or approval.

NOV 08 IMPROVEMENT - Improved prompt ID assignment for app updates

Enhanced the app update functionality to automatically use the prompt ID from the policies object when no top-level prompt ID is provided. The system now checks both the direct prompt_id field and the policies.prompt_id field, ensuring the prompt ID is properly applied to the app configuration even when specified only in policies.

NOV 08 NEW - Added prompt review and publication workflow with approval system

Introduced a comprehensive prompt review system that allows prompts to be submitted for publication, reviewed, and approved or declined with reasons. Users can now request to make their prompts public, administrators can review and approve/decline submissions with timestamps tracking published_on, reviewed_on, and approved_on dates, and prompts are now organization-scoped with enforcement preventing users from editing or deleting prompts they didn’t create. The system also tracks decline reasons when prompts are rejected during review.

NOV 07 FIX - Fixed prompt validation and deletion with better error handling

Fixed multiple issues with prompt management: Apps now validate that assigned prompts exist and belong to the correct organization before saving. Prompt retrieval and updates now return consistent error messages with proper error codes (INVALID_PROMPT_ID) instead of generic 404 errors. Prompt deletion now properly checks for associated apps and prevents deletion if the prompt is in use, returning specific app IDs that need to be updated first.

NOV 06 FIX - Fixed error handling for unauthorized access to internal Pulze resources

Fixed the error response when non-Pulze employees attempt to access internal resources. The system now returns a proper ‘ET_INTERNAL’ error code with detailed messaging (‘This resource is only accessible for Pulze’s internal admins’) and includes the organization name in the error details for better debugging. Previously used a generic 401 unauthorized error without proper error categorization.

NOV 06 IMPROVEMENT - Standardized prompts API endpoints to follow REST conventions

Updated prompts API endpoints to use standard REST conventions. The create prompt endpoint changed from POST /prompts/create to POST /prompts/, and the update prompt endpoint changed from PUT /prompts/update to PUT /prompts/. This change makes the API more consistent with RESTful design patterns while maintaining the same functionality.

NOV 03 NEW - Added delete prompt endpoint with app dependency validation

Added the ability to delete prompts through a new DELETE endpoint. The system now prevents deletion of prompts that are actively being used by any apps, returning error code ET_0007 with a list of affected app IDs. When a prompt is successfully deleted, all associated apps are cleaned up appropriately with soft-deletion for apps with request history.

NOV 03 NEW - Added custom prompt templates with dynamic prompt replacement

Introduced support for custom prompt templates that can be associated with apps and applied to requests. Users can now create prompts with a prompt placeholder that dynamically wraps user queries, enabling consistent prompt engineering across requests. The prompt can be set at the app level (prompt_id field) or overridden per-request via policies, allowing flexible prompt management for different use cases.

NOV 02 IMPROVEMENT - Increased rate limits to 9,500 requests per minute for OpenAI Tier 4

Significantly increased API rate limits from 50 to 9,500 requests per minute per app and from 150 to 9,500 requests per minute per organization to align with OpenAI’s 10,000 RPM Tier 4 limits. This allows for much higher throughput when making requests through the API, reducing rate limit errors for heavy usage scenarios.

NOV 02 NEW - Added ability to regenerate API keys for apps via UI

Users can now regenerate API keys for their apps directly through the UI using a new endpoint (POST //regenerate-key). When regenerated, the app receives a new API key with the ‘sk-’ prefix while maintaining the same app configuration. This feature requires editor-level permissions and includes improved error handling for invalid API keys.

NOV 02 IMPROVEMENT - Filter models by context window requirements for optimal selection

The engine now calculates the required context window based on prompt length and max_tokens, then only considers models that can accommodate the request. Uses tiktoken’s cl100k_base encoding to estimate token count and filters candidates accordingly. This prevents selection of models with insufficient context windows and provides clearer error messages when no suitable models are available.

NOV 01 NEW - Added prompt management system with CRUD operations and token calculation

Introduced a new prompt management feature that allows users to create, retrieve, update, and list prompts within their organization. Users can now store reusable prompts with titles and descriptions, automatically calculate token counts for prompts, and link prompts to applications. The system includes role-based permissions (viewer, editor, admin) for prompt operations and provides a dedicated API endpoint at /prompts for managing prompt templates.

October 2023

OCT 31 FIX - Fixed dashboard query returning empty results for active apps

Resolved an issue where dashboard queries were incorrectly filtering active apps. The bug was caused by comparing ‘is_active is True’ which would fail when is_active was NULL, now fixed to properly check ‘is_active’ as a boolean condition. This ensures all active apps are properly included in dashboard analytics and results.

OCT 30 IMPROVEMENT - Added support for additional OpenAI completion parameters

Added support for advanced OpenAI completion parameters including n (number of completions), logit_bias, presence_penalty, frequency_penalty, top_p, stop sequences, and best_of. The balance verification system now accounts for multiple generations per request (when n > 1 or best_of > 1), providing more accurate cost predictions and preventing requests that would exceed available balance.

OCT 30 IMPROVEMENT - Standardized API response format to match OpenAI structure

Transformed request response JSON structure to follow OpenAI API standards. Response fields like ‘id’, ‘usage’, ‘object’, and ‘model’ are now at the root level instead of nested under ‘metadata’, making the API more compatible with OpenAI client libraries and tools. A database migration automatically converts existing request logs to the new format.

OCT 28 IMPROVEMENT - Model scoring now applied even when requesting a single model candidate

Improved the model selection logic to ensure models are properly scored and ranked even when only one model candidate is requested. This change ensures consistent scoring behavior across all requests, providing better model performance metrics in the response metadata regardless of the number of candidates. The API response format has also been optimized to exclude null fields for cleaner output.

OCT 26 IMPROVEMENT - Added support for LangChain prompt-as-array format in completion requests

The completion API now accepts prompts as either a string or an array format, enabling compatibility with LangChain’s prompt formatting. When a single-element array is provided, it is automatically converted to a string. Multi-element arrays are rejected with a clear error message indicating only one prompt value is supported.

OCT 24 NEW - Added ability to edit AI models through Internal Page UI

Internal administrators can now view and edit AI model configurations directly through the UI. This includes updating model properties such as provider, model name, owner, version (@at), GDPR compliance status, open-source status, default active state, public visibility, context window size, URL, and description. Changes to model identifiers automatically regenerate the namespace to maintain consistency across the system.

OCT 24 NEW - Add ability to remove Prometheus and Loki integrations from organizations

Organizations can now fully remove their Prometheus (prom) and Loki monitoring integrations by clearing the configuration. When an integration update is sent without Prometheus or Loki settings, the system will now properly clear all related fields (endpoint, id, and token) instead of leaving the previous configuration in place. This allows for complete integration removal rather than only supporting updates.

OCT 23 IMPROVEMENT - Removed verification link from email verification API response

Enhanced security by removing the email verification link from the API response when requesting a new verification email. The verification link is now only sent via email and no longer exposed in the API response payload, reducing the risk of link exposure through API logs or client-side code.

OCT 23 IMPROVEMENT - Enhanced email validation with domain verification

Email validation now checks if domains are actually valid and reachable, not just whether they’re temporary. When inviting users to organizations or validating email addresses, the system now verifies that the domain exists using DNS resolution, providing clearer error messages like ‘Domain @example is invalid’ for non-existent domains and ‘Domains from @example are not allowed’ for temporary email providers.

OCT 20 TECHNICAL - Migrated model configuration from code to database table

Restructured model management by moving all model configurations (including GPT-4, GPT-3.5, Claude, PaLM, Llama 2, and other models) from hardcoded definitions to a dedicated database table. This change enables dynamic model management and per-app model settings, allowing for more flexible model availability and configuration without requiring code deployments.

OCT 20 FIX - Fixed last name extraction for single-word names

Fixed an issue in the name guessing function where single-word names (names without spaces) would cause errors. Previously, the function attempted to access the second element of a split name array using index [1], which would fail for single-word names. Now uses pop() to correctly extract the last word, handling both single-word and multi-word names properly.

OCT 20 IMPROVEMENT - Renamed organization integrations endpoint and added validation

The organization integrations endpoint has been renamed from /integration to /integrations for better API consistency. Added field validation to ensure integration credentials (id, token, and endpoint) are not empty strings when configuring Prometheus and Loki integrations. This prevents configuration errors from invalid or missing integration parameters.

OCT 19 NEW - Added integration update endpoint for Prometheus and Loki configurations

Organizations can now configure external monitoring integrations through a new API endpoint. Administrators can set up Prometheus and Loki integrations by providing endpoint URLs, authentication tokens, and integration IDs. This enables organizations to connect their monitoring and logging infrastructure directly to the platform.

OCT 18 NEW - Added Grafana Cloud integration with Prometheus remote_write support

Organizations can now integrate with Grafana Cloud for metrics and logs export. The integration adds support for Prometheus remote_write protocol and Loki for log aggregation. New organization-level configuration fields allow setting Prometheus endpoints, IDs, and tokens, as well as Loki endpoints, IDs, and tokens for secure data export to Grafana Cloud monitoring services.

OCT 16 IMPROVEMENT - Ray clusters now automatically cleanup after job completion

Ray clusters are now automatically shut down and cleaned up immediately after jobs finish execution. This is configured with shutdownAfterJobFinishes enabled and ttlSecondsAfterFinished set to 0, ensuring resources are released promptly and reducing infrastructure costs for Ray-based workloads.

OCT 16 FIX - Fixed organization creation and HubSpot contact management

Fixed issues with organization creation to properly handle address fields including city, and improved HubSpot integration to prevent duplicate contacts by checking for existing contacts by email before creation. The system now correctly marks users as existing platform users in HubSpot and uses the pulze_name field instead of the generic name field for organization tracking.

OCT 14 FIX - Fixed max token limits for Llama 2 and CodeLlama models

Corrected the maximum token limits for several Llama models: CodeLlama-13b now supports up to 16,384 tokens (previously 2,048), and Llama-2-70b-chat now supports up to 4,096 tokens (previously 2,048). Additionally, updated token request limits for Claude models to 100,000 tokens (previously 4,096) in the knowledge graph seed data. These changes allow users to process longer inputs and outputs with these models.

OCT 14 FIX - Fixed issue preventing updates to organization address fields

Fixed a bug that prevented updating organization address information (address_1, address_2, address_city, address_zip, address_state, address_country) when submitting a full organization update. The endpoint now correctly accepts and processes all address fields regardless of the update type. Also relaxed validation to allow optional values for expense_synced_at and pending_expense fields.

OCT 13 IMPROVEMENT - Improved organization creation with validation and empty default values

Organization creation now requires display names and org names to be at least 4 characters long, with automatic whitespace trimming. New organizations no longer have auto-generated names or placeholder logos - instead they start with empty values, forcing users to set meaningful names through the UI. This ensures better data quality and more intentional organization naming.

OCT 13 FIX - Fixed security issue allowing access to logs from apps in other organizations

Fixed a critical security vulnerability in the logs endpoint where users could potentially access logs from applications belonging to other organizations. The system now properly verifies that all requested app IDs belong to the same organization and that the user has permission to access that organization’s data before returning any logs.

OCT 12 NEW - Organization setup wizard with mandatory display name configuration

New users are now required to complete organization setup after registration. Personal organizations are created with empty display names that must be filled in, and users can update organization details during initial setup without requiring editor permissions. The organization name format has been changed to include a timestamp (e.g., ‘org-2343252343423-’) to ensure uniqueness.

OCT 12 NEW - Added promo code/coupon support and 3D Secure card verification

OCT 12 NEW - Add support for alerting on Cloud Monitoring metrics

Added support for creating alerts based on Google Cloud Monitoring metrics. Includes a pre-configured alert for high backend latency that triggers when 99th percentile latency for HTTPS load balancer backends exceeds 10 seconds for 5 minutes. The monitoring system now supports both namespace-scoped Rules and cluster-wide GlobalRules for flexible alert configuration across different scopes.

OCT 11 IMPROVEMENT - Stripe payment processing switched from test mode to live production mode

Payment processing via Stripe has been upgraded from test credentials to live production credentials. All Stripe transactions will now process real payments instead of test payments. This enables the platform to accept actual customer payments and handle production payment workflows.

OCT 11 IMPROVEMENT - Enhanced model name support to include owner prefix (e.g., anthropic/claude)

Model names can now include an optional owner prefix, allowing more specific model identification like ‘anthropic/claude-2’ or ‘meta/llama-2’. The system now correctly parses and matches models with owner prefixes, ensuring proper model selection when the owner namespace is specified in requests.

OCT 07 MODEL - Added knowledge graph seed data with performance metrics for 54+ AI models

Added comprehensive knowledge graph seed data (dated 2023-10-07) containing performance metrics and pricing information for 54+ AI models across 8 providers including AI21 Labs (j2-ultra, j2-mid, j2-light with 8191 token limits), Aleph Alpha (luminous-supreme, luminous-supreme-control, luminous-base-control with 1990 token limits), and others. Each model includes category-specific performance scores across 20 different domains (Arts & Crafts, Technology & Gadgets, Business & Finance, etc.), pricing per token in USD, latency metrics, and availability status.

OCT 06 IMPROVEMENT - Improved prompt formatting for Mistral-7B-OpenOrca model on Replicate

Enhanced the Mistral-7B-OpenOrca model integration to automatically format prompts with the correct chat template markers when they are not already formatted. This ensures proper model behavior without requiring users to manually add template formatting to their prompts.

OCT 06 FIX - Fixed input format for mistral-7b-openorca model on Replicate

Fixed a compatibility issue with the mistral-7b-openorca model on Replicate provider. The model now correctly receives input using the ‘message’ parameter instead of ‘prompt’, allowing it to process requests properly. This change ensures the model works as expected without breaking other Replicate models.

OCT 06 FIX - Fixed model identifier capitalization for MosaicML Llama2 70B Chat

Corrected the model identifier from ‘mosaicml/llama2-70B-chat’ to ‘mosaicml/llama2-70b-chat’ by fixing the uppercase ‘B’ to lowercase ‘b’ in the 70B parameter designation. This ensures proper model naming consistency in the knowledge graph seed data and may resolve issues with model lookups that expect the correct lowercase identifier.

OCT 06 FIX - Fixed incorrect model path for Mistral 7B OpenOrca on Replicate

Corrected the model path for Mistral 7B OpenOrca on Replicate from ‘a16z-infra/mistral-7b-openorca’ to ‘nateraw/mistral-7b-openorca’. This fixes a copy error that would have prevented users from accessing this model with the correct repository path.

OCT 06 MODEL - Added Mistral-7B-OpenOrca model via Replicate

Added support for Mistral-7B-OpenOrca (nateraw/mistral-7b-openorca), a fine-tuned version of Mistral-7B-v0.1 trained on the OpenOrca dataset. This model is available through the Replicate provider with a 4K token context window and costs $0.000045 per token. Also updated the Mistral-7B-Instruct-v0.1 model configuration to increase its max token context from 2K to 4K tokens and improved its description.

OCT 06 MODEL - Updated token context limits for MosaicML models

Increased maximum token context for MosaicML models to match their actual capabilities. The mpt-30b-instruct model now supports 8,192 tokens (up from 2,048), and the llama2-70b-chat model now supports 4,096 tokens (up from 2,048). Users can now generate longer completions and work with larger contexts when using these models.

OCT 06 MODEL - Added MosaicML MPT-30B-Instruct and Llama2-70B-Chat models

Added two new MosaicML models available for scoring: MPT-30B-Instruct (30B parameters, 8,192-token context length, trained on datasets including Databricks Dolly-15k, HH-RLHF, CompetitionMath, and others) and Llama2-70B-Chat (70B parameters, 4,096-token context length, Meta’s dialog-optimized model trained on 2T tokens with 1M+ human annotations). Also updated the MPT-7B-Instruct model description to reflect it as a 6.7B parameter instruction-finetuned model and corrected its pricing from

0.0000005 to

0.00000005 per token.

OCT 06 MODEL - Added MosaicML as a supported AI vendor with API key management

Added support for MosaicML as a new AI model provider with automatic load balancing across three API keys using least connection mode. Each key is configured with a rate limit of 3,500 requests per minute (RPM) for optimal throughput and reliability.

OCT 06 NEW - Added email verification status display and manual verification request

Users can now see whether their email address has been verified in their account settings. Added a new endpoint (/verify-email/request) that allows users to manually request a new verification email if their email is not yet verified. The system now tracks email verification status in user profiles and prevents sending duplicate verification emails to already-verified addresses.

OCT 06 MODEL - Added Mistral-7B-Instruct-v0.1 model via Replicate

Added support for Mistral-7B-Instruct-v0.1, a 7-billion-parameter language model available through Replicate (a16z-infra/mistral-7b-instruct-v0.1). The model has an estimated 2048 token context window and uses the Dolly tokenizer as an approximation for token counting.

OCT 06 FIX - Fixed app update validation to prevent empty descriptions

App descriptions are now required to have at least 1 character when updating an app. Previously, empty descriptions were incorrectly accepted, which could result in apps without proper identification. The API now returns a 422 Unprocessable Entity error when attempting to update an app with an empty description.

OCT 03 IMPROVEMENT - Enhanced user profile update security by retrieving auth0_id from token

Improved security for user profile updates by retrieving the auth0_id from the authentication token instead of the request payload. This prevents users from potentially modifying other users’ profiles by manipulating the auth0_id in the request. Additionally, profile editing is now restricted to only Auth0-authenticated users (excluding social login profiles).

OCT 03 NEW - Added user profile editing functionality for Auth0 users

Users can now update their profile information including first name, last name, and profile picture through a new PUT endpoint. The update synchronizes changes across Auth0, the database, and HubSpot, ensuring profile consistency across all systems. Additionally, the email verification endpoint has been renamed from ‘/update-user’ to ‘/verify-email’ for better clarity.

OCT 03 FIX - Fixed rank playground to prevent automatic model switching on failures

Fixed the rank playground feature to always set max_switch_model_retries to 0, preventing automatic model switching when ranking models. This ensures that model rankings are tested independently without fallback behavior, regardless of app settings or header policies. Also improved validation to require at least one message in playground requests and better error handling for invalid request IDs.

OCT 02 FIX - Fixed error response when no model can generate a valid answer

Changed the HTTP status code returned from 204 (No Content) to 411 (Length Required) when the system exhausts all retry attempts without generating a valid response. The error message now clearly states ‘(no answer generated)’ instead of ‘Empty response’. Additionally, the default policy for switching between models when requests fail has been reduced from 3 retry attempts to 1, meaning the system will now try a maximum of 2 models (original + 1 fallback) before returning an error.

September 2023

SEP 30 FIX - Fixed LlamaIndex integration payload handling and method chaining

Fixed an issue with the LlamaIndex integration where payload data and headers were not being properly initialized before processing requests. The fix ensures that populate_payload_data is called before process in all API endpoints (chat completions, completions, and playground), and adds validation to prevent processing without payload data. This resolves potential errors when using LlamaIndex for custom data retrieval and document querying.

SEP 29 NEW - Added Policies and Weights configuration with separate Settings for Apps

Restructured app configuration by separating model weights, policies, and general settings into distinct fields. The previous single ‘app_settings’ field has been renamed to ‘weights’ for model selection preferences, while new ‘policies’ field (based on LLMModelPolicies schema) controls model behavior constraints, and a new ‘settings’ field stores general app configuration. This allows for more granular control over app behavior and model selection strategies. Also improved file upload validation to prevent duplicate files with identical sizes from being uploaded.

SEP 27 IMPROVEMENT - Enabled optimization for internal requests in Playground by default

The Playground now automatically optimizes internal requests by default (optimize_internal_requests=1). This improvement should result in better performance and efficiency when using the Playground feature, as internal API calls will be optimized without requiring manual configuration.

SEP 27 IMPROVEMENT - Improved LlamaIndex data engine with custom keyword extraction and error handling

Enhanced the LlamaIndex-based document querying engine with a custom keyword extraction template that better identifies relevant keywords while avoiding stopwords. The engine now uses Claude Instant v1 as the default model for fast document indexing operations. Added comprehensive error handling that returns ‘(no answer)’ with HTTP 417 status when the engine fails to generate a response, instead of silently failing.

SEP 27 NEW - Added autocomplete support for Prometheus series, labels, vectors, sum, rate, and count queries

Enhanced the metrics proxy to support Prometheus series queries (api/v1/series) and label name value queries (api/v1/label/name/values) with proper filtering and autocompletion. These new endpoints enable users to query time series metadata and metric names that start with ‘pulze_’ prefix, improving the metrics exploration and query building experience. The filtering logic now handles metrics without names and applies key-based access control across all query types including vector operations like sum, rate, and count.

SEP 26 MODEL - Added MosaicML as a new AI provider

Added support for MosaicML as a new AI model provider. Users can now access MosaicML-hosted models through the API with support for temperature, top_p, and max_tokens parameters. The integration includes automatic token usage calculation using the GPT-NeoX-20B tokenizer and cost tracking per request.

SEP 26 FIX - Fixed TypeError in metrics filtering when receiving malformed Prometheus data

Fixed a TypeError that occurred when the metrics proxy received malformed or unexpected data formats from Prometheus. The fix adds validation to ensure metric data is properly structured as a dictionary with expected keys before filtering, preventing crashes when encountering invalid metric formats. Additionally, improved error handling now provides clearer error messages and logging when receiving invalid JSON responses, empty content, or missing required fields like ‘data’ and ‘result’ in the response structure.

SEP 26 FIX - Fixed LlamaIndex queries returning empty responses for custom data

Fixed an issue where LlamaIndex queries on custom data could return empty responses. The system now retries up to the configured max_same_model_retries limit when an empty response is received, and returns a fallback message ‘(no answer was generated)’ with a 417 status code if all retries are exhausted. This ensures users always receive a meaningful response instead of empty results when querying their custom data.

SEP 26 NEW - Added hierarchical log structure with parent-child relationships

Introduced a parent-child hierarchy for logs by adding a parent_id field to the request table. This allows logs to be organized in nested structures, enabling better tracking of related requests and sub-requests. The change includes database schema updates and modifications to the log retrieval system to support hierarchical log views.

SEP 26 IMPROVEMENT - Made end date optional for logs and dashboard stats queries

The date_to filter parameter is now optional when querying logs and dashboard statistics. When not provided, the system automatically defaults to the current time (UTC). This simplifies API requests where users want to retrieve data up to the present moment without manually specifying the end date.

SEP 25 NEW - Added Prometheus metrics endpoint with app-level filtering

Added a new metrics API endpoint (/metrics/prometheus-proxy) that proxies requests to Prometheus for monitoring data. The endpoint automatically filters metrics to show only those with the ‘pulze_’ prefix that belong to your specific app based on your API key, ensuring you only see metrics relevant to your application. Supports both GET and POST requests for querying Prometheus data.

SEP 25 IMPROVEMENT - Separated payment methods and payment intents into dedicated endpoint

Restructured the billing API by moving payment-related operations to a new dedicated /billing/payments endpoint. This architectural improvement separates payment method management (retrieving, adding, and managing payment cards) from other billing operations, making the API more organized and maintainable. Users will now interact with a cleaner API structure for managing their payment methods and viewing billing information including Stripe payment methods, setup intents, and organization credit balance.

SEP 25 IMPROVEMENT - Enhanced billing system with account balance and improved payment validation

Refactored the billing system to follow Stripe’s best practices. New users now receive a free starting balance in their account (configured per currency). Added comprehensive payment method validation including minimum charge verification ($0.50) and balance tracking. Improved payment method deletion with safety checks to prevent removing the last payment method. Changed billing information viewing from admin-only to viewer permissions, allowing more team members to see payment details.

SEP 23 FIX - Fixed prompt formatting and improved header filtering for data engine

Fixed an issue where chat prompts were being formatted with incorrect bracket notation (now uses plain role labels like ‘user’ and ‘assistant’ instead of ‘[user]’ and ‘[assistant]’). Also resolved a bug where the wrong prompt format was being logged in chat completions, and improved custom header extraction by filtering out additional common headers (host, origin, referrer) that were unnecessarily being stored.

SEP 23 FIX - Fixed playground model ranking with incorrect temperature and weight values

Fixed a critical bug in the playground where the temperature parameter was incorrectly using max_tokens value instead of the actual temperature setting, and weights were not being properly serialized when ranking models. This caused incorrect model recommendations in the playground interface. The fix ensures that model ranking now uses the correct parameters for accurate results.

SEP 22 FIX - Fixed organization invitation handling for declined invites and member removal

Fixed the ability to resend invitations to users who previously declined - the system now automatically removes the declined invitation and allows a new one to be sent. Also improved the member removal process to properly handle both active organization members and pending invitations, ensuring they are correctly deactivated and deleted from the database. Additionally, the invitation status field is now strictly validated to only accept ‘accepted’, ‘declined’, or ‘pending’ values.

SEP 21 IMPROVEMENT - Added balance information to frontend settings response

Frontend settings now include the organization’s complete balance information (credit balance, free balance, spending limits, pending expenses, and billing zip code). This provides users with immediate visibility into their account balance and spending status when accessing the application settings, without requiring a separate API call to the billing endpoint.

SEP 20 NEW - Enhanced retry policy with separate controls for same-model and model-switching retries

Introduced granular retry policies that allow independent configuration of same-model retries and model-switching retries. Users can now specify max_same_model_retries (attempts with the same model before switching) and max_switch_model_retries (attempts with different models after exhausting same-model retries), replacing the previous single max_retries parameter. The engine now intelligently rotates through ranked models based on these policies, providing better control over fallback behavior.

SEP 20 NEW - Multi-file upload support for custom app data

The custom data upload endpoint now accepts multiple files in a single request instead of just one file. Users can upload multiple files simultaneously to their apps, with each file being processed and stored individually. The API response now includes details about all successfully uploaded files.

SEP 19 IMPROVEMENT - Removed free credits for manually created organizations

Organizations created manually through the API no longer receive free signup credits. Only organizations created during user registration receive the initial free balance. This change simplifies the billing system and ensures consistent credit allocation, with pending charges now being synced and processed more reliably through the updated billing system.

SEP 19 FIX - Fixed redirect loop when accessing shared playground conversations

Fixed an infinite redirect loop that occurred when accessing shared playground conversations. The issue was caused by the optional bearer token authentication incorrectly reading tokens, which has been corrected by properly making the function async and passing the request object. Additionally, improved the error message to clarify when login is required for private conversations.

SEP 19 IMPROVEMENT - Enhanced model ranking API to include namespace and attribute details

Updated the rank_models endpoint to return additional metadata for each ranked model, including the full namespace (e.g., ‘provider/model_name’) and attribute information (the ’@’ suffix). This provides more complete model identification information when querying ranked models by score, making it easier to distinguish between different versions or variants of the same model.

SEP 19 NEW - Enabled billing system with $20 free credit for new organizations

Activated the billing system for all organizations (previously restricted to internal use only). New organizations now receive

20 USD in free credits upon signup (reduced from

50). The system automatically syncs pending expenses with Stripe when they reach a threshold, and now properly tracks organization-level rate limiting to prevent abuse during the free credit period.

SEP 19 IMPROVEMENT - Improved label filtering with date range and app support

Enhanced the label filtering system to support date range filtering and multiple app selection. When retrieving labels and label values, users can now apply the same date and app filters used in other searches, making label filtering consistent with the rest of the dashboard filtering capabilities. The API now uses a unified FilteredSearch schema for better consistency across endpoints.

SEP 18 IMPROVEMENT - Improved log timestamp precision from seconds to milliseconds

Log timestamps now store millisecond precision instead of second precision, providing more accurate timing information for API requests and responses. The system now uses time.time_ns() // 1_000_000 to capture timestamps in milliseconds, enabling better tracking and analysis of request latency and timing. A database migration automatically converts existing timestamps to the new millisecond format.

SEP 18 FIX - Fixed max_tokens parameter not being respected in Replicate provider

Fixed an issue where the max_tokens parameter was not being properly applied to Replicate API calls. The parameter is now correctly passed within the input object. Additionally, resolved a compatibility issue where passing temperature=0 would fail; the system now defaults to 0.75 (Replicate’s default) when temperature is set to 0.

SEP 18 IMPROVEMENT - Advanced sorting capabilities for application logs and request history

Enhanced the logs filtering interface to support multi-column sorting with customizable sort parameters. Users can now sort logs and application lists by multiple fields simultaneously (such as date, description, user information) instead of being limited to a single descending date sort. The sorting parameters can be passed through the API to create more complex query orderings for better data organization and analysis.

SEP 18 FIX - Fixed labels without policies failing to parse in request headers

Fixed an issue where custom labels sent in request headers would fail to parse if no policies were specified alongside them. The system now correctly handles labels independently of policy definitions, preventing request failures when using labels for tracking without associated policies like max_retries or timeout.

SEP 18 IMPROVEMENT - Changed organization API endpoint path from /org to /orgs

Updated the organization API endpoint from /org to /orgs for better REST API naming consistency. All API calls to organization-related endpoints now use the plural form /orgs instead of the singular /org. This change affects the API router configuration and test functions.

SEP 18 NEW - Added request policies for privacy levels and cost tracking

Introduced privacy level settings for API requests, allowing organizations to control data handling policies. Added cost tracking capabilities with a new ‘costs_incurred’ flag to distinguish between billable and non-billable requests. Organizations can now receive free credits through a new ‘free_balance’ field for managing promotional or trial usage.

SEP 17 IMPROVEMENT - Return unique label keys and values in label filtering queries

Improved label filtering to return only unique label keys and values by adding GROUP BY clauses to the database queries. This eliminates duplicate entries when retrieving available label keys and their corresponding values, making the label filtering interface cleaner and more efficient.

SEP 17 NEW - Added Playground with model ranking and logs functionality

Introduced a new Playground feature that allows users to test completions and chat completions with model ranking capabilities. The playground includes dedicated endpoints for ranking models based on weights, temperature, and prompts, and provides logs for monitoring requests. Users can now experiment with different settings and see which models perform best for their specific use cases before integrating them into their applications.

SEP 14 IMPROVEMENT - Improved LlamaIndex custom data processing with file metadata and response modes

Enhanced LlamaIndex integration to include file metadata (file names) when indexing documents, which provides better context for AI responses. Added support for multiple response modes (compact_accumulate, tree_summarize, refine, simple_summarize, no_text, accumulate, compact) configurable via headers, and implemented proper temp directory cleanup to prevent resource exhaustion and hanging issues.

SEP 13 NEW - Added ability to delete custom data files from apps

Users can now delete custom data files that have been uploaded to their apps. The API endpoint has been updated to allow deletion by file ID (/custom-data//files/), and file size tracking has been added to custom data uploads to display how much storage each file uses. This gives users better control over managing their app’s data and storage.

SEP 13 IMPROVEMENT - Custom Files now stored in database instead of filesystem

Custom data files uploaded to apps are now stored directly in the database using a new app_custom_data table, replacing the previous filesystem-based storage. This improves data management, backup reliability, and simplifies deployment architecture. Files are temporarily extracted to disk only during query processing and automatically cleaned up afterward.

SEP 13 NEW - Added LlamaIndex integration with custom data support via /completions endpoint

The /completions endpoint now supports LlamaIndex integration for querying custom uploaded documents. When files are uploaded for an app, the endpoint automatically switches to using LlamaIndex with a KeywordTableIndex to query the custom data instead of standard completions. The previous /llama endpoint has been consolidated into the main /completions endpoint, providing a unified interface for both standard and custom-data-powered completions.

SEP 12 NEW - Custom data upload and retrieval for apps using LlamaIndex

Added ability to upload custom data files to apps and query them using LlamaIndex integration. Users can now upload files through POST /apps/custom-data/, delete files via DELETE endpoint, and view uploaded files with their MIME types when retrieving app details. The completions endpoint now supports custom data retrieval by loading uploaded files from app-specific directories and using LlamaIndex’s KeywordTableIndex for semantic search over the custom documents.

SEP 11 NEW - Added autocomplete support for log label keys and values filtering

Added a new /labels endpoint that enables autocomplete functionality for filtering logs by labels. Users can now retrieve all available label keys across their logs, or when a specific key is provided, fetch all possible values for that label key. This improves the filtering experience by allowing users to discover and select from existing label keys and values rather than typing them manually.

SEP 08 IMPROVEMENT - Updated custom header names from 'Custom-Labels' to 'Pulze-*' prefix

Changed the custom header naming convention for better consistency and clarity. Headers for passing custom configuration are now prefixed with ‘Pulze-’ (e.g., ‘Pulze-Labels’, ‘Pulze-Weights’, ‘Pulze-Policies’) instead of the previous ‘Custom-Labels’ format. This standardization makes it clearer which headers are Pulze-specific and improves API consistency across all custom configuration options.

SEP 07 FIX - Fixed billing zip code field returning incorrect data in API response

Corrected the billing information endpoint to properly return the billing zip code field. Previously, the endpoint was incorrectly configured and wasn’t returning the billing_zip value in the organization spending limits response, which now properly includes this field to match the expected data model.

SEP 06 NEW - Support for multiple payment methods with duplicate card detection

Organizations can now add and manage multiple payment methods for billing. The system automatically detects and prevents duplicate cards using Stripe’s card fingerprint verification. Added a new delete endpoint to remove payment methods, and top-up payments can now specify which payment method to use instead of defaulting to a single card.

SEP 05 FIX - Fixed model ranking endpoint by removing deprecated max_num_models parameter

Fixed an issue with the model ranking API endpoint that was failing due to the removal of the ‘max_num_models’ attribute from the request payload. The endpoint now uses a fixed constant of 3 models for ranking instead of accepting a user-configurable parameter, ensuring consistent behavior and preventing errors when requesting ranked model recommendations.

SEP 05 IMPROVEMENT - Improved temporary email domain validation performance from O(n) to O(1)

Optimized the temporary email domain checking mechanism by switching from a List to a Set data structure when validating against approximately 162,000 disposable email domains. This change reduces lookup time from linear O(n) complexity to constant O(1) complexity, providing near-instantaneous domain validation regardless of list size. The performance improvement is particularly noticeable when processing multiple email validations.

SEP 05 MODEL - Added knowledge graph seed data with performance metrics for AI models

Added comprehensive knowledge graph seed data containing performance benchmarks across 20 categories for multiple AI model providers including AI21 Labs (j2-ultra, j2-mid, j2-light with 8191 token limits), Aleph Alpha (luminous-supreme, luminous-supreme-control, luminous-base-control with 1990 token limits), and others. Each model includes detailed metrics such as pricing per token, latency ratings, weight scores, and category-specific performance scores across domains like Arts & Crafts, Technology & Gadgets, Health & Wellness, and more. This data enables better model selection and routing based on use case requirements.

SEP 02 MODEL - Renamed Claude v2 model identifier from 'claude-v2' to 'claude-2'

Updated the Anthropic Claude v2 model identifier from ‘anthropic/claude-v2’ to ‘anthropic/claude-2’ for consistency with Anthropic’s naming conventions. This change affects how the model is referenced in API calls. The model maintains its 100K token limit and description as Anthropic’s best-in-class offering for complex reasoning tasks.

SEP 01 IMPROVEMENT - Enhanced model listing with structured data and configurable API key prefix

Improved the models API endpoint to return structured model information including provider, model name, and type details instead of plain strings, making it easier to integrate and display model data. Added a new /models/all endpoint to retrieve all available models in the platform. API keys now use a configurable prefix (via KEY_PREFIX setting) instead of hardcoded ‘sk-’ prefix, and validation ensures keys start with the correct prefix.

SEP 01 FIX - Fixed GooseAI completion response handling for null finish_reason values

Fixed an issue where GooseAI API responses with null finish_reason values would cause errors. The provider now explicitly handles null finish_reason by converting it to an empty string, ensuring more reliable response processing when using GooseAI models.

SEP 01 NEW - Added public documentation endpoint displaying available AI models in table format

Introduced a new /docs-models-table endpoint that returns an HTML table of available models for documentation purposes. For requests from allowed documentation origins (docs.pulze.ai), the table displays all models including third-party providers. For requests from other origins, only synthetic Pulze models (pulze, pulze-v0) are shown for security. The table includes model names, descriptions, providers, token limits, cutoff dates, and active status with sortable columns.

SEP 01 IMPROVEMENT - Removed deprecated 'category' field from model ranking API response

The ‘category’ field has been removed from the PulzeEngineModelRanking response schema in the ranked models endpoint. This field was previously deprecated and always returned “(deprecated)” as its value. API responses will now only include the ‘models’ field containing the ranked list of models, making the response cleaner and more straightforward.

SEP 01 IMPROVEMENT - Improved billing system with background charging and pending expense tracking

API request charges are now processed in the background instead of synchronously, improving response times. The system now tracks pending expenses between Stripe syncs with new fields (pending_expense, expense_synced_at, currency) in the organization table. Added the ability to update spending limits (hard/soft) directly through a new API endpoint, and enhanced top-up functionality with background email notifications.

SEP 01 IMPROVEMENT - API now returns request log ID even when requests fail

The API now consistently returns the request log ID (log_id) in error responses, making it easier to track and debug failed requests. Previously, when a request failed, the log ID was not included in the error response. This improvement includes latency information in failed request responses and ensures background tasks complete even when errors occur.

August 2023

AUG 31 NEW - Added conversation sharing functionality for playground

AUG 31 MODEL - Added CodeLlama 13B model on Replicate

Added support for CodeLlama 13B (replicate/codellama-13b), a 13-billion-parameter Llama model tuned for code completion with 2048 max tokens. The model is initially disabled by default and uses an optimized tokenization approach that doesn’t require loading the full LLaMA tokenizer.

AUG 31 MODEL - Added Claude 2 model from Anthropic with 100K context window

Added support for Anthropic’s Claude 2 model (anthropic/claude-2) with 100,000 token context window. Claude 2 is described as Anthropic’s best-in-class offering with superior performance on tasks that require complex reasoning. This update also upgrades the Anthropic SDK to version 0.3.11 with improved error handling for timeouts, connection errors, and rate limits.

AUG 30 NEW - Added Prometheus metrics endpoint for model monitoring

Added a new Prometheus-compatible metrics endpoint to expose model performance and usage statistics. The ModelMonitor has been updated to use a new response type that better integrates with Prometheus monitoring and alerting systems, enabling improved observability of model API calls, latencies, and error rates.

AUG 23 FIX - Fixed access control to prevent interaction with soft-deleted apps

Fixed a security issue where soft-deleted apps (is_active=False) could still be accessed through API keys, log queries, and shared playground conversations. Now all database queries properly filter out inactive apps, ensuring deleted apps are completely inaccessible. Also added proper 404 error handling for shared playground conversations when no chats are found.

AUG 22 IMPROVEMENT - Improved model scoring to use task-specific category performance

Enhanced the model selection algorithm to prioritize models based on their performance in specific task categories (e.g., coding, reasoning, writing). When selecting a model for a particular task, the system now uses the model’s category-specific quality score if available, falling back to the average across all categories only when needed. The scoring formula was also improved to properly weight quality (higher is better), latency (lower is better), and cost (lower is better) with normalized values in the 0-1 range.

AUG 21 IMPROVEMENT - Increased Playground model benchmark limit from 2 to 3 models

The Playground now supports benchmarking up to 3 models simultaneously, increased from the previous limit of 2 models. This allows users to compare performance and outputs across three different language models in parallel during completion requests, making it easier to evaluate model behavior side-by-side.

AUG 19 NEW - Added health check endpoint with database and Redis monitoring

Added a new /healthz endpoint that monitors API service health by checking database and Redis connectivity. The endpoint returns detailed status information including whether each service is active, individual latency measurements for database and Redis operations in milliseconds, total request latency, and current server time. This enables better monitoring and troubleshooting of the API infrastructure.

AUG 18 NEW - Enhanced Playground with conversation sharing and improved response handling

AUG 15 IMPROVEMENT - API endpoints now accessible with App API Keys and fixed model weights handling

Multiple API endpoints including logs, models, and app updates can now be accessed using App-specific API Keys in addition to user tokens. When using an App’s API Key, log endpoints automatically filter to only show logs for that specific app. Additionally, fixed an issue where model weights would not be properly applied when ranking models - the system now correctly falls back to the app’s default settings when custom weights are not provided.

AUG 14 NEW - Model ranking endpoint now accessible via API key authentication

Added API key authentication support to the /models/rank endpoint, allowing users to retrieve ranked best model recommendations programmatically. The endpoint now accepts playground completion requests with configurable parameters including messages, temperature, max_tokens, and a new max_num_models parameter (default: 2) to control how many top-ranked models are returned with their scores. Previously, this model ranking functionality was only available through other authentication methods.

AUG 14 IMPROVEMENT - Added model scoring details to API response metadata

API responses now include detailed model scoring information in the metadata, showing the ranking and scores of the top models considered for each request. This provides transparency into how the system selected the best model, including quality, latency, and overall scores for the top candidates evaluated by the engine.

July 2023

JUL 27 NEW - Added invitation email system for joining organizations

Implemented a comprehensive email invitation system that sends welcome emails to users invited to join an organization. New users receive an email with an email verification link, while existing verified users receive a welcome email without verification. The system includes Auth0 integration for email verification, custom email templates with organization and inviter details, and automatic handling of verification tickets. Users are redirected to the platform after completing the invitation process.

JUL 22 NEW - Added billing system with Stripe integration for credit management

Introduced a comprehensive billing system that allows organizations to manage payment methods and top up account credits through Stripe. Organizations now have a credit balance tracking system, can add and store payment methods (cards), view payment information, and perform top-ups that automatically update their account balance. The system includes support for spending limits (soft and hard) and creates transaction history for all balance changes. New users automatically get a Stripe customer account created during registration.

JUL 22 IMPROVEMENT - Improved model selection speed and optimized performance

Significantly improved the speed of model selection by optimizing the model transformer initialization. The sentence transformer model (paraphrase-distilroberta-base-v1) is now loaded once globally instead of per-request, reducing latency for prompt classification. Additionally, the scoring algorithm now skips computation entirely when users specify a model directly, further improving response times.

JUL 21 IMPROVEMENT - Improved Playground model recommendations with request payload context

Enhanced the Playground’s model recommendation engine to consider the complete chat completion request payload when suggesting best models. The engine now receives full context including messages, parameters, and settings to provide more accurate model recommendations tailored to your specific request.

JUL 19 FIX - Improved error messages for organization invitations and fixed name parsing

Updated invitation error messages to be clearer and more actionable, including specific guidance when invitations are not found, have incorrect status, or belong to different emails. Fixed a bug in name parsing where users with single-word names (no space in full name) would cause errors, now properly handles names without spaces by setting last name as empty string.

JUL 14 IMPROVEMENT - Renamed 'Keys' to 'Apps' throughout the application

The terminology for managing API access has been updated from ‘Keys’ to ‘Apps’ across the entire application. This includes renaming the /keys API endpoint to /apps, updating database tables and columns (key → app, key_configuration → app_settings, key_id → app_id), and revising all related UI references. This change provides clearer terminology that better reflects the concept of managing application configurations rather than just API keys.

JUL 13 NEW - Allow setting model selection weights via custom-labels header

Users can now override default model selection weights (quality, cost, latency) by passing a custom ‘weights’ object in the custom-labels header. This allows fine-grained control over model selection criteria on a per-request basis, enabling users to optimize for specific priorities like cost-efficiency or response quality without changing their API key configuration.

JUL 13 IMPROVEMENT - Added log filtering capability in backend with organization-level access control

Enhanced the logs endpoint to support filtering capabilities through a new FilterLogsRequest parameter, allowing users to narrow down their log queries. Additionally, strengthened security by enforcing organization-level access control across all key and log operations, ensuring users can only access logs and API keys belonging to their organization. Added a response_text column to the request table for improved log readability.

JUL 11 MODEL - Introduced pulze and pulze-v0 synthetic models with automatic routing

Added support for two new synthetic model identifiers: ‘pulze’ and ‘pulze-v0’. When users specify either of these models in their API requests, the system automatically routes to the optimal underlying model through Pulze’s intelligent selection. These models bypass the standard allowed model list and are available to all API keys, enabling users to leverage Pulze’s model routing without specifying a particular provider’s model.

JUL 10 NEW - Added Terms of Service and Privacy Policy acceptance tracking

Users’ Terms of Service and Privacy Policy acceptance is now tracked with timestamps. The API now returns the last review dates for both documents through a new /general/settings endpoint, allowing the frontend to display when users last accepted these policies. Added a new /general/accept-terms endpoint to record when users accept either the privacy policy or terms of service.

JUL 10 IMPROVEMENT - Auto-generated names for Apps created without a description

Apps (API keys) created without a description now automatically receive a randomly generated name combining an adjective and noun (e.g., ‘autumn_waterfall’, ‘silent_moon’). Previously, Apps could be created with no name, making them difficult to identify. This improvement makes it easier to distinguish between multiple Apps in your organization.

JUL 07 FIX - Fixed dashboard graph data swapping between errors and latency metrics

Corrected a bug where the Errors and Latency graphs were displaying each other’s data. The Errors graph now correctly shows the count of failed requests (status code >= 400), while the Latency graph displays average request latency in seconds. Additionally, the Savings graph now displays positive values instead of negative values, and graph calculations now properly handle datasets with zero values.

JUL 07 MODEL - Replaced gpt-neo-20b with replicate/dolly-v2-12b model

Switched the model monitoring from gpt-neo-20b to replicate/dolly-v2-12b. Additionally, the falcon-40b-instruct model has been deactivated and is no longer available for use. This change updates the model lineup to use Dolly v2 12B from Replicate as the monitored model.

JUL 07 NEW - Added configurable graph display with cumulative data toggle for dashboard

The dashboard now allows the API to control which graphs are displayed and whether data should be shown cumulatively or as individual data points. This includes four main graphs: Requests (blue), Errors (red), Latency (blue), and Cost Savings (green). Users can now toggle between cumulative and non-cumulative views for each graph independently through the API configuration, providing more flexible data visualization options.

JUL 05 MODEL - Added AlephAlpha and Hugging Face providers with multiple models

Added two new AI providers to the knowledge graph: AlephAlpha with luminous-supreme model (1990 token limit, $0.000038 per token) and Hugging Face integration. The knowledge graph has been updated with new performance metrics across 20 content categories for existing models, showing recalibrated scoring data for better model selection and routing.

JUL 04 IMPROVEMENT - Updated graph colors to use Pulze brand palette

Refreshed the visual appearance of analytics graphs to use the official Pulze color palette. Requests and Latency graphs now display in Pulze blue (#017EFA), Error graphs in red (#EF4444), and Savings graphs in green (#14BD81) for improved brand consistency and visual clarity.

JUL 04 IMPROVEMENT - Made colored log output opt-in via PULZE_LOG_COLOR environment variable

Colored logging is now disabled by default and can be enabled by setting the PULZE_LOG_COLOR environment variable to ‘True’. Previously, colored output was always enabled, which could cause issues in environments that don’t support ANSI color codes. The logger now defaults to plain text formatting unless explicitly configured otherwise.

JUL 04 MODEL - Disabled GooseAI models (gpt-j-6b and gpt-neo-20b) by default

GooseAI models gpt-j-6b and gpt-neo-20b (both with 2048 max tokens) have been disabled by default and are no longer active for use. These models can still be manually enabled if needed, but will not be available in the default model selection.

JUL 04 MODEL - Removed deprecated base OpenAI models: davinci, curie, babbage, and ada

Cleaned up four legacy OpenAI base models (davinci, curie, babbage, and ada) from the knowledge graph seed data. These older base model configurations have been removed while their versioned counterparts (text-davinci-003, text-curie-001, text-babbage-001, and text-ada-001) remain available. This streamlines the model catalog by removing duplicate entries for deprecated model naming conventions.

JUL 04 NEW - Added cumulative cost savings graph to analytics dashboard

Added a new cumulative cost savings graph that tracks the total amount saved over time by using the platform. The graph displays savings in dollars and accumulates values across the selected time period, providing visibility into overall cost optimization. The requests graph color was also updated from green to blue to differentiate it from the new green savings graph.

JUL 04 MODEL - Added Replicate provider with dolly-v2-12b model support

Added support for Replicate as a new AI provider, including the dolly-v2-12b model (replicate/dolly-v2-12b). This integration includes automatic token counting using the databricks tokenizer, cost calculation, and full completion API support with configurable temperature and max_tokens parameters.

JUL 03 NEW - Added playground model recommendations and new model update tracking for API keys

Added a new endpoint ‘/playground-best-models’ that returns the top 2 recommended models based on the engine’s analysis of your completion request. Enhanced the API key management system to track newly available models and provide model update information when retrieving or updating keys. Added a new ‘/merge-models’ endpoint that allows enabling or disabling newly available models for specific API keys in bulk.

JUL 03 MODEL - Added Huggingface provider with falcon-40b-instruct model support

Added support for Huggingface as a new LLM provider, enabling access to the open-source falcon-40b-instruct model. The integration includes automatic token counting using the model’s tokenizer, cost calculation, and error handling for endpoint availability. Users can now select Huggingface models through the API alongside existing providers like OpenAI, Anthropic, and Cohere.

June 2023

JUN 30 IMPROVEMENT - User profile information now syncs automatically on every sign-in

User profiles are now automatically updated with the latest information from Auth0 during both sign-in and sign-up. This ensures profile details like name, email, email verification status, and profile picture remain synchronized between authentication provider and the application. Previously, existing users’ information was not updated after initial registration.

JUN 29 FIX - Fixed GooseAI provider response handling error

Fixed a bug in the GooseAI provider where usage data was being read from the wrong response object (response_openai instead of res), which would cause errors when processing completion requests. This ensures that token usage information is correctly extracted from the API response.

JUN 29 FIX - Fixed missing 'choices' key in GooseAI provider responses

Fixed a critical bug in the GooseAI provider where the ‘choices’ key was not being properly extracted from API responses, which would cause request failures. The fix ensures that completion choices are now correctly populated in the response object before calculating tokens and metadata.

JUN 29 IMPROVEMENT - Added latency tracking for Aleph Alpha provider

Improved latency tracking across all LLM providers with standardized measurement. The Aleph Alpha provider now includes latency metrics in response metadata, matching the behavior of other providers like OpenAI, Anthropic, and Cohere. Latency is measured in seconds and rounded to 4 decimal places for consistency.

JUN 29 FIX - Fixed dashboard showing incorrect minute intervals before 1AM

Fixed a calculation error in the dashboard’s time interval generation that caused incorrect minute-level data grouping for time periods before 1AM. The issue was caused by calculating the range size in hours and multiplying by 60, instead of directly calculating the range in minutes. This affected minute-granularity charts displaying data for time ranges under 8 hours.

JUN 28 NEW - Added ability to fetch individual log entries by ID

Users can now retrieve a specific log entry by its unique ID through the GET /logs/ endpoint. This enhancement improves log inspection capabilities by allowing direct access to individual log records instead of only viewing paginated lists. The endpoint includes proper authorization checks to ensure users can only access logs from keys within their organization.

JUN 28 FIX - Remove incorrect timezone suffix from date responses

Fixed date format in statistics graph responses by removing the incorrect ‘.000Z’ timezone suffix. Date fields now return in the format ‘YYYY-MM-DDTHH:MM:SS’ instead of ‘YYYY-MM-DDTHH:MM:SS.000Z’, providing more accurate timestamp representation without falsely implying UTC timezone when timezone information wasn’t being properly set.

JUN 25 MODEL - Added AI21 Labs and Aleph Alpha models with performance scores

Added support for AI21 Labs models (j2-ultra, j2-mid, j2-light with 8191 token context) and Aleph Alpha models (luminous-supreme and luminous-supreme-control with 1990 token context). All models include performance scores across 20 categories including Arts & Crafts, Technology & Gadgets, Business & Finance, and more. Updated knowledge graph to version 2023-06-22 with complete category scoring for intelligent model routing.

JUN 23 MODEL - Added AlephAlpha Luminous model family with 4 new models

Added support for AlephAlpha’s Luminous model family with 4 models: luminous-supreme (largest, best for creative writing), luminous-supreme-control, luminous-base-control (fastest and cheapest, ideal for classification), and luminous-extended-control. All models support up to 1,990 max tokens and are optimized for different use cases including information extraction, language simplification, classification, and labeling tasks.

JUN 23 FIX - Fixed missing creation timestamp in model API responses

Fixed an issue where API responses from model endpoints could be missing the ‘created’ timestamp field. The timestamp generation has been moved from the database layer to the engine provider layer, ensuring all responses include a proper Unix timestamp indicating when the response was created, even when errors occur during request processing.

JUN 22 FIX - Fixed OpenAI API responses being incorrectly overwritten

Resolved a critical bug where OpenAI API responses were being overwritten, causing the service to break. The fix ensures that response data from OpenAI’s completion and chat completion endpoints is now properly preserved by storing the API response in a separate variable (response_openai) before extracting choices and usage data into the final response object. This affects both OpenAI text completion and chat completion models.

JUN 22 IMPROVEMENT - Enhanced playground chat response labels with optimization goal display

Improved the playground chat interface to display the active optimization goal (e.g., ‘Optimizing for: cost’ or ‘Optimizing for: latency’) as a separate label. The weight distribution label now uses an ‘info’ style instead of ‘success’ for better visual distinction between the optimization goal and weight values.

JUN 22 NEW - Added support for AI21 Labs, Aleph Alpha, and Anthropic providers

Expanded API provider support by adding three new vendors: AI21 Labs, Aleph Alpha, and Anthropic. These providers are now included in the API key seeding configuration alongside existing providers (Cohere, GooseAI, and OpenAI), allowing users to configure and use models from these additional vendors through the API.

JUN 21 MODEL - Added AI21 Labs (AI21 Studio) provider support

Added support for AI21 Labs (now called AI21 Studio) as a new model provider. Users can now access AI21 Labs models including j2-mid for text completion tasks. The integration includes full support for token counting, cost calculation, and latency tracking with configurable parameters like temperature, top_p, max_tokens, and number of results.

JUN 21 MODEL - Added Aleph Alpha provider support

Added integration with Aleph Alpha as a new AI model provider. Users can now access Aleph Alpha’s language models through the API, with full support for completions, token counting, cost calculation, and latency tracking. The provider is now available alongside existing providers (OpenAI, Anthropic, Cohere, GooseAI).

JUN 20 MODEL - Added AI21 Labs (AI21 Studio) provider support

Added support for AI21 Labs (now called AI21 Studio) as a new AI provider. Users can now access AI21’s J2-Mid model through the platform for text completion tasks. The integration includes full support for model parameters like temperature, top_p, max_tokens, and multiple completions (n parameter), along with token usage tracking and cost calculation.

JUN 20 FIX - Fixed Anthropic provider temperature parameter typo

Corrected a typo in the Anthropic provider where the temperature parameter was misspelled as ‘temperatur’, causing the temperature setting to be ignored during API calls. This fix ensures that temperature values are now properly applied to Anthropic model completions, allowing users to correctly control response randomness.

JUN 17 NEW - Added per-minute dashboard view and API key filtering for analytics

Dashboard analytics now support per-minute granularity for time ranges under 8 hours, enabling more detailed monitoring of recent API usage patterns. Added the ability to filter dashboard metrics by specific API keys, allowing users to analyze performance and costs for individual keys. The key creation response now includes the key_id field for easier reference in filtering.

JUN 16 IMPROVEMENT - Updated knowledge graph to 2023-06-15 version with improved token handling

Updated the model knowledge graph to the 2023-06-15 version with enhanced data accuracy. Improved token usage tracking by changing category types from integers to floats for more precise cost calculations and model weight handling. Fixed token usage validation in OpenAI providers to properly handle new usage data fields and log warnings for unknown keys.

JUN 16 NEW - Added /models endpoint to retrieve available models

Added a new API endpoint at /models that allows users to retrieve a list of active models based on their API key permissions and current model availability. This endpoint validates the user’s API key and returns models filtered by their account’s model settings and the knowledge graph stored in Redis.

JUN 15 IMPROVEMENT - Reduced API cold start latency by setting minimum instance count to 25

Improved API response times by configuring the Cloud Run autoscaler to maintain a minimum of 25 instances running at all times. This eliminates cold starts for most requests, ensuring faster and more consistent API response times, especially during periods of low traffic or after idle periods.

JUN 15 IMPROVEMENT - Improved request handling by changing concurrency model to prevent thread conflicts

Changed the API to process requests sequentially (concurrency=1) instead of using async threadpools to resolve thread safety issues with PyTorch model instances and SentenceTransformer. CPU allocation reduced from 4 cores to 1 core per container to match the new single-request processing model. This ensures more stable and reliable request processing, though individual requests may now be handled one at a time per worker instance.

JUN 15 NEW - Added interactive playground API for testing chat completions

Introduced a new playground API endpoint that allows users to test chat completions directly through the UI. The playground now displays comprehensive response metadata including provider and model information, latency metrics, cost estimates, and configurable quality/speed/cost weighting preferences. Users can save and review their playground conversations with detailed labels showing performance characteristics.

JUN 14 IMPROVEMENT - Improved categorization accuracy and API performance with parallel processing

Switched the prompt categorization system back to the SBERT (Sentence-BERT) model using ‘paraphrase-distilroberta-base-v1’ for better accuracy, replacing the previous TF-IDF vectorizer approach. Additionally, enabled multi-worker processing in both development and production environments by utilizing all available CPU cores, which significantly improves API throughput and request handling capacity.

JUN 10 NEW - Added Dashboard statistics API and playground mock fetch capability

Introduced a new Dashboard API endpoint that provides comprehensive usage statistics including request counts, token usage, cost savings metrics, and latency data. Added playground functionality with mock fetch capabilities for testing API requests. Enhanced database views to track latency metrics from request metadata for better performance monitoring.

JUN 08 MODEL - Added support for Anthropic Claude v1 text completion models

Added support for Anthropic’s Claude v1 models for text completion requests. The integration includes full token calculation, cost tracking, and usage metrics. Users can now access Claude models through the Pulze API with automatic prompt formatting using Anthropic’s HUMAN_PROMPT and AI_PROMPT format, with support for configurable temperature and max tokens parameters.

JUN 08 FIX - Fixed token calculation for Cohere text completion API responses

Fixed token counting for Cohere API responses when using the text_completion endpoint. The system was incorrectly looking for tokens in choice[‘message’][‘content’] instead of choice[‘text’], which caused token calculation failures. Also corrected the provider name from ‘cohereai’ to ‘cohere’ in the knowledge graph configuration.

JUN 08 FIX - Standardized request type naming and fixed Cohere response format

Standardized all text completion request types from ‘completions’ to ‘text_completions’ across all providers for consistency. Fixed Cohere text completion response format to return ‘text’ field directly instead of wrapping it in a ‘message’ object with role and content, aligning with standard text completion response structure.

JUN 08 IMPROVEMENT - Added status code to logs endpoint response

The /logs endpoint now returns the HTTP status code for each request in the response data. This allows users to see the status code (e.g., 200, 400, 500) alongside other request details like prompt, payload, and response, making it easier to track and debug API request outcomes.

JUN 08 FIX - Fixed trailing whitespace in least_connection mode configuration

Fixed a configuration bug where the least_connection mode setting had trailing whitespace for Cohere, OpenAI, and GooseAI vendor environment variables. This whitespace could have caused the mode to be incorrectly interpreted, potentially affecting load balancing behavior across these API providers.

JUN 08 FIX - Fixed API key validation crashes when Redis keys are missing or invalid

Fixed a critical issue where the API key validation system would crash when attempting to decode Redis values that were missing or returned unexpected data types. The system now gracefully handles these exceptions by logging the error and returning false for invalid keys, preventing service disruptions during rate limit checks.

JUN 07 FIX - Fixed missing Cohere provider in API key seeding process

Fixed an issue where Cohere API keys were not being included when seeding the Redis database with provider API keys. The seeding script now properly includes Cohere alongside OpenAI and GooseAI providers. Added verification logging to confirm all three providers (Cohere, OpenAI, GooseAI) are successfully seeded.

JUN 06 IMPROVEMENT - Renamed Cohere AI provider and added cloud infrastructure support

Renamed the ‘cohereai’ provider to ‘cohere’ throughout the system for consistency with the official provider name. This affects the Cohere command and command-light models (4096 token max). Added infrastructure support for Cohere API keys in cloud deployments with least-connection load balancing mode.

JUN 05 NEW - Added CohereAI provider support with completions endpoint

Added initial support for CohereAI as a new LLM provider, enabling users to access Cohere’s language models through the completions API. The integration includes automatic token calculation, cost tracking, latency metrics, and full error handling. Additionally, fixed the scoring algorithm to properly handle inverse metrics for cost and latency optimization when selecting models.

JUN 05 IMPROVEMENT - Improved error handling in log exporter with proper exit codes

Enhanced the log export process to properly exit with error status code 1 when aborting due to missing ‘good_answer’ column tables. This ensures that automated scripts and CI/CD pipelines can correctly detect when log export operations fail, preventing silent failures in production environments.

JUN 04 FIX - Fixed organization list to include personal organizations

Fixed an issue where users with only a personal organization would receive an empty organization list. The API now correctly returns all organizations a user belongs to, including their personal organization, ensuring users always see their available organizations.

JUN 03 NEW - Added log rating system and organization billing information management

Users can now rate API request logs with a thumbs up/down and provide textual feedback through a new rating endpoint. Organization profiles can now be updated with comprehensive billing information including address fields (address_1, address_2, city, zip, state, country), billing email, and spending limits (soft and hard limits). Personal organizations are restricted from these modifications.

JUN 02 IMPROVEMENT - Improved error messaging for rate limit and capacity errors

When the service is over capacity or hits rate limits, users now receive a clear, actionable error message: ‘We are currently over capacity. Please try again later, and if the problem persists, contact [email protected] for further assistance.’ Previously, users would see raw technical error messages. Additionally, error details are now logged for better debugging and support.

JUN 02 FIX - Fixed rate limit error responses to display properly as text

Fixed an issue where rate limit errors (HTTP 429) were not being properly formatted when returned to users. The error detail is now converted to a string format, ensuring error messages are displayed correctly instead of potentially showing object representations. This affects all API endpoints that enforce rate limiting.

JUN 02 IMPROVEMENT - Improved over capacity error message clarity

Updated the error message shown when the service is over capacity from ‘We are over capacity’ to ‘We are currently over capacity’ to provide clearer communication about the temporary nature of the issue. This message appears when OpenAI or GooseAI API keys are unavailable, and continues to direct users to contact [email protected] if problems persist.

JUN 02 IMPROVEMENT - Improved error handling when API service is over capacity

Added proper error handling for when the service exceeds capacity and Redis cannot provide valid API keys. Users now receive a clear HTTP 429 (Too Many Requests) error message stating “We are over capacity. Please try again later, and if the problem persists, contact [email protected] for further assistance” instead of experiencing undefined behavior. This applies to OpenAI and GooseAI API endpoints.

JUN 01 IMPROVEMENT - Disabled public OpenAPI specification endpoint for improved security

The OpenAPI specification endpoint (/api/v1/openapi.json) is now disabled by default to prevent exposing internal API schema details in production. This security enhancement prevents potential attackers from viewing the complete API structure and available endpoints. The endpoint can still be manually enabled in development environments if needed.

JUN 01 NEW - Add Custom-Labels header support for request tagging and categorization

Added support for a new ‘Custom-Labels’ header that allows users to attach custom key-value labels to API requests for tracking and categorization purposes. Labels are passed as JSON objects in the header and are returned in the response metadata, enabling users to organize and filter requests by custom dimensions like environment (internal/external), request type, or any other custom attributes.

JUN 01 NEW - Added support for targeting specific models in API requests

Users can now specify a target model in their API requests using the ‘model’ parameter, allowing direct model selection while still validating against allowed models for the API key. The engine will use the specified model instead of the automatic scoring system, enabling manual model selection for testing and specific use cases. Added friendly error handling with HTTP 418 status for requests targeting the ‘pulze’ model name.

May 2023

MAY 31 IMPROVEMENT - Implemented load balancing across API keys with Redis for OpenAI and GooseAI

Added intelligent load balancing system to distribute requests across multiple API keys for OpenAI and GooseAI providers. The system supports least-connection mode to select API keys with the lowest active request count, respects rate limits (RPM - requests per minute) for each key, and automatically tracks usage counters in Redis. This helps prevent rate limit errors and improves reliability by distributing load across available API keys.

MAY 31 NEW - Added sortable columns to API Keys table

API Keys table now supports sorting by multiple columns with ascending or descending order. Users can sort keys by any column (such as creation date, name, or request counts) and specify custom sort orders. The endpoint was changed from GET to POST to accept sorting parameters including column name, sort direction (asc/desc), and whether to enable multiple column sorting.

MAY 31 IMPROVEMENT - Updated billing and organization permissions for enhanced access control

Changed billing endpoint permissions from Viewer to Editor level, requiring Editor access to view billing information across all time periods (minute-by-minute, daily, and monthly usage). Additionally, updated organization admin list visibility to require Admin permissions instead of Viewer permissions, and removed the Admin permission requirement for creating new organizations, allowing any authenticated user to create their first organization.

MAY 30 IMPROVEMENT - Added cost savings calculation and improved cost estimation accuracy

Introduced a new ‘cost_savings’ field in API response metadata that calculates potential savings based on provider and usage patterns. Improved cost calculation by computing costs before adding to metadata, ensuring more accurate estimates. The cost savings feature compares actual costs against baseline costs to show users how much they’re saving through optimized routing and model selection.

MAY 30 IMPROVEMENT - Added model validation and improved error handling for unavailable models

Implemented validation logic to verify that models are both allowed for an API key and active in the knowledge graph before processing requests. The system now returns clear error messages when the knowledge graph is unavailable (503 Service Unavailable) or when no valid models are found for an API key (404 Not Found), preventing requests from failing silently and improving debugging capabilities.

MAY 26 IMPROVEMENT - Improved error handling with correct HTTP status codes for API requests

Enhanced error handling across OpenAI and GooseAI API endpoints to return more accurate HTTP status codes. Rate limit errors now properly return HTTP 429 (Too Many Requests) instead of 404, invalid request errors return HTTP 400 (Bad Request) with detailed error messages, and unexpected errors return HTTP 500 (Internal Server Error). This provides clearer feedback when API calls fail and helps developers debug issues more effectively.

MAY 26 IMPROVEMENT - Improved HTTPS handling and proxy support for API deployment

Enhanced the API server to properly handle HTTPS connections when deployed behind reverse proxies like Nginx. Added automatic HTTPS redirect middleware to ensure all connections use secure protocols, and configured Uvicorn with forwarded-allow-ips=’*’ to correctly process forwarded headers from proxy servers. This replaces the previous custom redirect handling middleware with a more robust solution.

MAY 26 IMPROVEMENT - Removed latency tracking from API key metrics

Removed the latency column from request tracking and the average latency metric from API key statistics. The API key list view no longer displays average response latency per key, simplifying the metrics shown to focus on request counts (total requests and last week requests) along with status codes.

MAY 26 IMPROVEMENT - Enhanced organization members and invites management with unified endpoint

Merged organization members and invites into a single endpoint that displays both existing members and pending invitations together. The API key list now includes creator information (name and picture), request statistics (total requests and last week’s requests), and average latency metrics for each key. Also added the ability to soft-delete API keys by marking them as inactive instead of permanently removing them.

MAY 25 FIX - Fixed API routing to prevent automatic URL slash redirects

Fixed an issue where API requests with trailing slashes were being automatically redirected (HTTP 307) to URLs without trailing slashes, which could cause problems with certain API clients. The new middleware now properly handles routes regardless of trailing slashes without performing redirects, ensuring more predictable API behavior.

MAY 25 FIX - Fixed API redirect behavior for URLs with missing trailing slashes

Fixed an issue where API requests without trailing slashes were being automatically redirected, causing problems with certain API calls. The API router now handles URLs consistently regardless of whether they end with a trailing slash, preventing unexpected redirect behavior that could break integrations or cause request failures.

MAY 25 FIX - Fixed chat format conversion for messages with unrecognized roles

Fixed an issue in the chat format converter where messages with roles other than ‘user’ or ‘assistant’ would cause errors or be silently dropped. The converter now includes a fallback handler that preserves the content of messages with unrecognized roles, ensuring all chat messages are properly processed.

MAY 23 FIX - Fixed chat message format conversion returning empty strings

Fixed a bug in the chat message format converter where non-list chat messages would result in an empty string being returned. The function now correctly processes individual chat messages by properly assigning the formatted prompt string to the return variable, ensuring chat messages are converted to the expected format (User: … / Assistant: …) even when processing single messages.

MAY 23 FIX - Fixed invite permissions to use correct viewer permission constant

Fixed organization invitation permissions to properly use the standardized PERMISSIONS.VIEWER.ALL constant instead of the hardcoded ‘view:all’ string. This ensures invited users receive the correct viewer-level permissions when joining an organization. Additionally improved permission error messages to show both required and actual user permissions for easier debugging.

MAY 22 IMPROVEMENT - Migrated from role-based to granular permission-based access control

Replaced simple role flags (is_admin, is_editor, is_viewer) with a flexible permission-based system that allows fine-grained access control per resource type. Organization members now have specific permissions (like VIEWER.KEY, EDITOR.KEY) instead of broad role assignments, enabling more precise control over who can view, edit, or manage API keys, logs, and billing information. This change affects all endpoints including keys, logs, billing, and organization management.

MAY 20 NEW - Added user authentication system with post-registration workflows and invitations

Implemented a complete user authentication system including a user table with Auth0 integration, post-registration/post-login webhook handlers that automatically create user records, and an organization invitation system. Users can now be invited to organizations via email with tracked invitation statuses (pending, accepted, declined), and the system handles both email/password and social login (Google, GitHub) authentication methods while keeping user data synchronized with Auth0.

MAY 17 IMPROVEMENT - Renamed 'make cleanup' to 'make clean' and fixed container stopping errors

The Makefile command for cleaning up local development has been renamed from ‘make cleanup’ to ‘make clean’ for consistency with standard conventions. Additionally, fixed race condition errors that occurred when running the cleanup command if containers (redis-stack, pulzeai-db, or pulzeai-backend) were not already running - the command now checks if containers exist before attempting to stop them and provides informative messages.

MAY 17 IMPROVEMENT - Improved RBAC permissions for billing, logs, and organization management

Updated role-based access control (RBAC) permissions across multiple endpoints. Billing and logs endpoints now allow viewer-level access (previously required admin/editor), enabling more team members to view usage data without edit permissions. Organization and API key management endpoints adjusted to require editor permissions instead of admin for delete/update operations. The org list endpoint now returns detailed role information (admin, editor, viewer) for each organization the user belongs to.

MAY 16 NEW - Multi-tenant organization support with database-based permissions for Auth0

Added multi-tenant organization functionality allowing users to belong to multiple organizations with role-based access control. Organizations now support personal and shared workspaces, with member tracking including last login and join dates. API requests are now scoped to organizations, and billing endpoints enforce organization-based permissions (admin and editor roles) instead of user-level authorization.

MAY 16 IMPROVEMENT - Added database seeding script for simplified local development setup

Introduced a new database seeding script (scripts/seed_database.sh) to streamline local development environment setup. Developers can now populate their local database with test data by running a single command after starting the development server. The documentation has been updated with clear instructions to run the seeding script as the second step in the local setup process.

MAY 13 FIX - Fixed CORS configuration to support Vite development server on port 5173

Updated the frontend CORS allowed origins to use port 5173 instead of port 3000, matching the default Vite development server port. This fixes cross-origin request issues when running the frontend locally. The outdated localhost:3000 and local.auth:3000 endpoints have been removed from the allowed origins list.

MAY 13 IMPROVEMENT - Added Poetry installation instructions to README

Enhanced the setup documentation with detailed Poetry installation instructions, including the command and troubleshooting steps for certificate issues. Also clarified that Docker must be installed and running before running the application locally, and corrected the repository URL to the official pulze/api location.

MAY 12 IMPROVEMENT - Improved model selection to respect API key model restrictions in latency mode

Enhanced the latency optimization mode to filter model candidates based on API key model_settings restrictions before evaluating latency metrics. This ensures that when using latency-optimized requests, only models explicitly allowed in your API key configuration will be considered, preventing failed requests to restricted models and improving response reliability.

MAY 11 IMPROVEMENT - Enhanced API authentication to validate token active status

Improved API authentication security by validating that API tokens are both valid and active before allowing access. Previously, the system only checked if a token existed in the database; now it also verifies the token’s is_active status, preventing inactive or revoked tokens from being used. Additionally, updated the unauthorized error message from ‘Bearer token missing or unknown’ to a more concise ‘Unauthorized’ response for both chat completions and completions endpoints.

MAY 11 NEW - Added model routing support with per-key model settings and configuration

API keys can now be configured with allowed model routing settings. When validating API keys, the system now returns model_settings and key_configuration parameters that control which models each key can access, enabling fine-grained access control and model routing at the key level.

MAY 11 IMPROVEMENT - Added rate limiting to API requests (50 requests per minute per API key)

Implemented Redis-based rate limiting for API requests with a hardcoded limit of 50 requests per minute per API key. When the rate limit is exceeded, users will receive a 429 HTTP error with details about their current request count. Rate limits are tracked using one-minute time windows and automatically reset each minute.

MAY 10 FIX - Fixed token verification to properly enforce scopes and permissions

Fixed a security issue where API token scopes and permissions were not being validated during authentication. The system now correctly checks that tokens have the required scopes (space-separated string values) and permissions (list values) before granting access to protected endpoints, preventing unauthorized access to API resources.

MAY 09 NEW - Enhanced API key management with configuration settings and update capability

Added support for customizable model settings and key configurations when creating API keys. Keys now include an is_active status flag for better lifecycle management. Also added the ability to update existing API keys through a new PUT endpoint, allowing users to modify key settings after creation.

MAY 08 FIX - Fixed missing error message when time threshold exceeded before any results

Fixed an issue where no error message was returned when the time threshold was exceeded before generating any results. Now displays a clear error message suggesting to increase the time limit from the current value to double (e.g., if limit is 5s, suggests increasing to 10s), along with information about which execution modes were attempted.

Getting Started

Models

AI Agents

Pulze Guide

Tools Guide

Vibe Coding

Developer Guide

API REFERENCE

COMMUNITY

PULZE ACADEMY

​Changelog

​October 2025

​September 2025

​August 2025

​June 2025

​May 2025

​April 2025

​March 2025

​February 2025

​January 2025

​December 2024

​November 2024

​October 2024

​September 2024

​August 2024

​July 2024

​June 2024

​May 2024

​April 2024

​March 2024

​February 2024

​January 2024

​December 2023

​November 2023

​October 2023

​September 2023

​August 2023

​July 2023

​June 2023

​May 2023